FINAL  REPORT 
Schemas  in  Problem  Solving: 

An  Integrated  Model  of  Learning,  Memory,  and  Instruction 


Sandra  P.  Marshall 


December  1991 




Center  for  Research  in  Mathematics  and  Science  Education 


College  of  Sciences 
San  Diego  State  University 
San  Diego,  CA  92182-0413 


This research is supported by the Office of
Naval Research, Cognitive Sciences Program,
Contract No. N00014-90-J-1143.

Reproduction in whole or in part is permitted for
any purpose of the United States Government.

Approved  for  public  release;  distribution 
unlimited. 




92-05117 


REPORT  DOCUMENTATION  PAGE 


Form Approved
OMB No. 0704-0188




1. AGENCY USE ONLY (Leave blank)


2. REPORT DATE
January 1992


3. REPORT TYPE AND DATES COVERED

Final Report 10/1/89 - 12/31/91


4. TITLE AND SUBTITLE

FINAL REPORT Schemas in Problem Solving: An
Integrated Model of Learning, Memory, and
Instruction


6. AUTHOR(S)

Sandra P. Marshall


5. FUNDING NUMBERS

N00014-89-J-1143  G 
442c010 — 6 


7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES)

Department of Psychology/CRMSE
San Diego State University
San Diego, CA 92182


8. PERFORMING ORGANIZATION
REPORT NUMBER


CRMSE Report 91-02


9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES)

Office  of  Naval  Research 
Cognitive  Science  Program  (Code  1142CS) 
800  N.  Quincy  Street 
Arlington,  VA  22217-5000 


10. SPONSORING/MONITORING
AGENCY REPORT NUMBER


11. SUPPLEMENTARY NOTES


12a. DISTRIBUTION/AVAILABILITY STATEMENT

Approved for public release; distribution
unlimited.


12b. DISTRIBUTION CODE


13. ABSTRACT (Maximum 200 words)

This final report contains two papers. The first describes statistical
and cognitive models used to simulate student performance. The
statistical model provides information about how the group of students
as a whole performed on an identification task involving word-problem
situations and shows differences among subgroups. The cognitive model
is a connectionist network that simulates the performance of each
student and yields details about how learning varied from one
individual to another. The second paper outlines a hybrid model of
schema knowledge that joins a connectionist network with a production
system. Details of the model are provided, and an example of its
output is presented.


14. SUBJECT TERMS

cognitive model, problem solving, hybrid model


15. NUMBER OF PAGES


17. SECURITY CLASSIFICATION
OF REPORT


NSN 7540-01-280-5500


Standard Form 298 (Rev. 2-89)

Prescribed by ANSI Std. Z39-18

298-102

FINAL  TECHNICAL  REPORT 


Grant No.:               ONR N00014-89-J-1143
Period:                  October 1, 1989 - December 31, 1991
Date of Submission:      January 25, 1992
Name of Institution:     San Diego State University
Title of Project:        Schemas in Problem Solving: An Integrated Model of
                         Memory, Learning, and Instruction
Principal Investigator:  Sandra P. Marshall


Table of Contents


Project Summary

Project Publications and Reports

Statistical and Cognitive Models of Learning through Instruction
(Sandra P. Marshall)

    Introduction
    The Experiment
    Statistical Analysis
    The Cognitive Model
    Discussion

Problem-Solving Schemas: Hybrid Models of Cognition
(Sandra P. Marshall & John P. Marshall)

    Overview
    The Performance Model of Constraint Knowledge
    The Learning Model
    The Hybrid Model




Project  Summary 


(Summary of research carried out under ONR Contract No. N00014-85-K-0661 and
continued under ONR Grant No. N00014-90-J-1143)

This document serves as the final technical report of ONR Grant No. N00014-
90-J-1143, which was funded as a continuation of work carried out under ONR
Contract No. N00014-85-K-0661. As part of the initial project I developed a
schema-based model of teaching and learning for the domain of arithmetic word
problems (Marshall, Pribe, & Smith, 1987). The schemas emphasize the basic
situations that can be contained in such problems. A central focus of the
research was to create a model that applied equally well to issues of memory
organization, teaching and learning, instructional development, and diagnosis
of student learning.

A core set of situations was identified, and a series of studies verified that
the situations were sufficient for describing virtually all legitimate word
problems (Marshall, 1990). A model of schema knowledge was constructed for each
of the basic situations. Each schema model specified the feature knowledge,
constraint knowledge, planning knowledge, and implementation knowledge required
to use the schema successfully. An extension of the basic schema model yielded
ways in which affective components may also be part of schema knowledge
(Marshall, 1989). Attention was also given to ways in which different types of
schema knowledge could be easily assessed (Marshall, 1988). More recently, I
have demonstrated that the schema theory can be applied easily to two other
domains, elementary statistics and rational number instruction (Marshall, in
press a).


The instructional system, called STORY PROBLEM SOLVER (SPS), was designed to
provide instruction about these situations in such a way as to foster the
development of appropriate schemas by individuals (Marshall, Barthuli, Brewer,
& Rose, 1989). The system consists of (a) a series of lessons requiring about
6-8 hours for completion and (b) a flexible problem-solving environment. Both
of these components were designed to focus on specific aspects of schema
knowledge required in solving problems.

In the lessons, each component of schema knowledge was addressed implicitly
through short instructional segments and related exercises. Students were
introduced to a set of icons depicting the situations, and they were encouraged
to use the icons to represent the various situations occurring in specific
problems. A set of experiments revealed that students did develop the specific
components of schema knowledge targeted by SPS and that the icons were a part
of their knowledge (Marshall & Brewer, 1990). Moreover, we were able to chart
the development of schemas over the course of instruction through individual
interviews with our subjects (Marshall, in press b).

The second part of the system is a flexible Problem-Solving Environment (PSE),
in which students can experiment with problem representations by manipulating
the icons described above. Students are able to select a subset of icons to
represent a problem and to link these together to represent the connections in
the problem. They have options to expand the icons and explore individual
aspects of each one, to carry out computations, to select other icons if they
so desire, or to have the system display a possible representation of the
problem. This environment was developed under the original ONR Contract and
evaluated under the project continuation as Grant N00014-90-J-1143 (Marshall,
1991).

The project yielded three major products. First, I created a working computer-
based system of instruction that can be used to teach students about solving
word problems. The system has been used successfully with about 100 subjects to
date (primarily college students with weak problem-solving skills). Second, I
have developed and refined a theory of schema structure and acquisition. The
theory builds on the general nature of schema knowledge found in the cognitive
science and cognitive psychological literature but goes considerably beyond it.
In particular, the theory allows operational definition of key components of a
schema and thus allows empirical tests of whether individuals have acquired
these pieces. Third, as a direct consequence of studying the acquisition of
schema knowledge and attempting to evaluate students' learning, I have
formulated a new model of assessment. The model is a network model, and it
stipulates the need for assessing both the number of nodes and the connectivity
within the net. Thus, the project results allow us to use the theory of memory
organization (i.e., schema theory) to model learning, instruction, and
assessment. This last result has had the most far-reaching impact. As can be
seen from the attached list of publications and presentations, I have been
invited to make a number of contributions about assessing schema knowledge. The
importance here is that the theory developed during this project is unique in
its use of a common model for learning, instruction, and assessment. Moreover,
the theory provides the basis for a linkage between a psychological theory of
memory/learning and a new psychometric theory of testing.

Finally, the project also yielded several important modeling results. We have
simulated successfully the performance of students as they respond to the
computer exercises. The simulation uses estimates of their schema knowledge as
revealed in interviews. Both correct and incorrect responses are equally well
estimated. We have also employed a series of connectionist models which learn
to classify the situations expressed in story problems. The modeling continued
under the project renewal and is the focus of the report which follows. The
report has two sections. The first section describes statistical and cognitive
models of performance on the initial recognition task in the instructional
system. The models described therein successfully simulate actual student
performance on an item-by-item basis. The second section describes the full
model of schema instantiation. This is a hybrid model, incorporating both a
production system and a connectionist network. It successfully evaluates
multi-step story problems, recognizes their important relational components,
and solves the problems for the correct numerical solution.

At the end of this summary is a list of publications, technical reports,
conference presentations, and invited addresses that report research from these
two projects. Much of the work spanned both of them. Most of the research
results will be reported in a book now being prepared for publication by the
Cambridge University Press. The book should be completed by September 1992.




Project Publications and Reports


Publications:

Marshall, Sandra P. (1988). Assessing problem solving: A short-term remedy and a
long-term solution. In R. I. Charles & E. A. Silver (Eds.), The Teaching and
Assessing of Mathematical Problem Solving. Hillsdale, NJ: Lawrence Erlbaum
Associates.

Marshall, Sandra P. (1989). Affect in schema knowledge: Source and impact. In
D. B. McLeod & V. M. Adams (Eds.), Affect and mathematical problem solving. New
York: Springer-Verlag.

Marshall, Sandra P. (1990). The assessment of schema knowledge for arithmetic
story problems: A cognitive science perspective. In G. Kulm (Ed.), Assessing
higher order thinking in mathematics. Washington, D.C.: AAAS. [a]

Marshall, Sandra P. (1990). Generating good items for diagnostic tests. In N.
Frederiksen, R. Glaser, A. Lesgold, & M. Shafto (Eds.), Diagnostic Monitoring of
Skill and Knowledge Acquisition. Hillsdale, NJ: Lawrence Erlbaum Associates. [b]

Marshall, Sandra P. (in press). Assessing schema knowledge. In N. Frederiksen,
R. Mislevy, & I. Bejar (Eds.), Test Theory for a New Generation of Tests.
Hillsdale, NJ: Lawrence Erlbaum Associates. [a]

Marshall, Sandra P. (in press). Assessment of rational number understanding: A
schema-based approach. In T. Carpenter, E. Fennema, & T. Romberg, Rational
Numbers: An Integration of Research. Hillsdale, NJ: Lawrence Erlbaum
Associates. [b]

Marshall, Sandra P. (in press). Statistical and cognitive models of learning
through instruction. To appear in Meyrowitz, A. L. & Chipman, S. (Eds.),
Cognitive Models of Complex Learning. Norwell, MA: Kluwer Academic
Publishers. [c]

Technical Reports:

Marshall, Sandra P., Pribe, Christopher A., & Smith, Julie D. (1987). Schema
Knowledge Structures for Representing and Understanding Arithmetic Story
Problems.

Marshall, Sandra P. (1988). Assessing Schema Knowledge.

Marshall, Sandra P. (1988). Schema Knowledge for Solving Arithmetic Story
Problems: Some Affective Components.

Marshall, Sandra P., Barthuli, Kathryn E., Brewer, Margaret A., & Rose,
Frederic E. (1989). STORY PROBLEM SOLVER: A schema-based system of instruction.

Marshall, Sandra P. (1991). Computer-Based Assessment of Schema Knowledge in a
Flexible Problem-Solving Environment.




Presentations:


Marshall, Sandra P. (1987). Knowledge representation and errors of problem
solving: Identifying misconceptions. In W. Montague (Chair), Diagnosing Errors
in Science and Mathematics. Symposium conducted at the Annual Meeting of the
American Educational Research Association, Washington, D.C.

Marshall, Sandra P. (1988, April). Assessing schema knowledge. In N. Frederiksen
(Chair), Test Theory for Tests Based on Cognitive Theory. Symposium conducted at
the Annual Meeting of the American Educational Research Association, New
Orleans.

Marshall, Sandra P. (1989, January). The assessment of schema knowledge for
arithmetic story problems. In G. Kulm (Chair), Perspectives and Emerging
Approaches for Assessing Higher Order Thinking in Mathematics. Symposium
conducted at the Annual Meeting of the American Association for the Advancement
of Science (AAAS), San Francisco.

Marshall, Sandra P. (1990, April). What students learn (and remember) from word
problem instruction. In S. Chipman (Chair), Penetrating to the Mathematical
Structure of Word Problems. Symposium conducted at the Annual Meeting of the
American Educational Research Association, Boston.

Marshall, Sandra P. (1991, April). Computer-based assessment of schema knowledge
in a flexible problem-solving environment. In H. F. O'Neil, Jr. (Chair),
Extending the Frontiers of Alternative Assessment with Computer Technology.
Symposium conducted at the Annual Meeting of the American Educational Research
Association, Boston.


Invited Addresses:

"Remedial Instruction for Arithmetic Story Problems: A Cognitive Science
Approach." Invited address to the National Council of Teachers of Mathematics,
Chicago, April 1988.

"Schema Knowledge." Invited presentation to the Resource Center for Science and
Engineering, University of Puerto Rico, November 1989.

Unpublished Manuscripts:

Marshall, Sandra P. (1991). Understanding the situations of arithmetic word
problems: A basis for schema knowledge.

Marshall, Sandra P. & Brewer, Margaret A. (1990). Learning from icons: What you
see is what you get, or is it?


Work in Progress: Schemas in Problem Solving: An Integrated Model of
Instruction, Learning, and Assessment. To be published by Cambridge University
Press.




Statistical and Cognitive Models of
Learning through Instruction*


Sandra P. Marshall
Department of Psychology
San Diego State University
San Diego, CA 92182-0315

Abstract 

This chapter uses statistical and cognitive models to evaluate the
learning of a set of concepts about arithmetic word problems by a group of
students. The statistical model provides information about how the group of
students as a whole performed on an identification task involving word-
problem situations and shows differences among subgroups. The cognitive
model simulates the performance of each student and yields details about
how learning varied from one individual to another. It is a connectionist
model in which the middle layer of units is specified a priori for each
student, according to the student's level of understanding expressed in an
interview. The chapter concludes with a detailed comparison of the
simulated responses with the observed student responses.


INTRODUCTION 

The learning investigated here occurred as part of a study in which
students received computer-based instruction about arithmetic word
problems. The central topics of the instruction were five basic situations
that occur with great frequency in word problems: Change, Group,
Compare, Restate, and Vary. The instruction had three main segments:
(1) the introduction, in which the situations were described; (2) an in-depth
exploration, in which details of each situation were elaborated and presented
diagrammatically; and (3) the synthesis, in which combinations of situations
were introduced together with planning and goal-setting techniques.² For
each of the three parts of instruction, students engaged in multiple practice
exercises. The study reported in this chapter concerns only the first
segment of instruction (the introduction to the situations), and the primary
focus is the nature of the knowledge that individuals gained from that
introductory instruction.


* To appear in Meyrowitz, A. L. & Chipman, S. (Eds.), Cognitive Models of
Complex Learning. Norwell, MA: Kluwer Academic Publishers.

This chapter describes two analyses of what individuals learn from
instruction. Both analyses are needed. In the first case, learning is
examined in a traditional experimental paradigm, using established
statistical procedures. Group features, rather than individual characteristics,
receive greater emphasis in this paradigm, and conclusions drawn from the
analysis describe group commonalities. In the second case, learning is
examined by means of a cognitive model that simulates individual
performance. In this analysis, individuals' characteristics are studied, and
conclusions apply separately to each individual. As I indicate below, the
information gained from each analysis is valuable in a study of learning.
Neither one alone provides the complete picture.

The questions of interest in the research are what is the new knowledge
retained in memory as a result of instruction, when is it retained, and which
parts of it are later accessed and retrieved. During instruction, some new
information is (presumably) acquired and added to an individual's available
knowledge store. Not all possible information is taken in, and individuals
vary in the type and amount of new knowledge that enters memory. It is the
rare instance in which all learners learn exactly the same thing from a single
instructional lesson. More often, some learners noticeably remember a great
deal of the new information while others remember almost nothing.

The  Instructional  Domain 

This section provides a short description of the five situations used in
instruction. The situations are Change, Group, Compare, Restate, and
Vary, and they represent uniquely almost all simple stories found in
arithmetic story problems (Marshall, 1991).

² Details about the computer-based instruction can be found in Marshall,
Barthuli, Brewer, & Rose (1989), a technical report available from the author.

The Change situation is characterized by a permanent alteration over time
in a measurable quantity of a single, specific thing. Only the quantity
associated with one thing is involved in the Change situation. It has a
beginning state and an end state, with some intervention which causes a
transition from beginning to end. Usually, three numbers are of importance:
the amount prior to the change, the extent of the change, and the resulting
amount after the change has occurred.
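
The three amounts of a Change situation can be sketched as a small record.
This is only an illustration: the class and field names (`Change`, `start`,
`delta`, `end`) are assumptions introduced here and are not part of the
instruction. Given any two of the amounts, the third follows. The numbers
come from the Change example of Table 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Change:
    """One thing whose quantity is altered over time (hypothetical record)."""
    start: Optional[float] = None   # amount prior to the change
    delta: Optional[float] = None   # signed extent of the change
    end: Optional[float] = None     # resulting amount after the change

    def solve(self) -> "Change":
        """Fill in the single missing amount from the other two."""
        if self.end is None:
            return Change(self.start, self.delta, self.start + self.delta)
        if self.delta is None:
            return Change(self.start, self.end - self.start, self.end)
        if self.start is None:
            return Change(self.end - self.delta, self.delta, self.end)
        return self


# 300 sheets before printing, 35 afterwards: the job consumed 265 sheets.
print(Change(start=300, end=35).solve().delta)  # -265
```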

A Group situation is present if a number of small distinct sets are
combined meaningfully into one large aggregate. Thus, the Group situation
reflects class inclusion. The grouping may be explicit or implicit. If
explicit, the solver is told in the problem statement which small groups are
to be united. If implicit, the solver must rely on his or her prior semantic
knowledge to understand the group structure. For example, in a situation
involving boys and girls, the solver would typically be expected to know
that boys and girls form a larger class called children. The solver also
would be expected to understand that the members of the subgroups (i.e.,
boys or girls) retain their identity even when combined into a larger group
(i.e., children). Three or more numbers are necessary in a Group situation:
the number of members in each of the subgroups as well as the overall
number in the combination.

The Compare situation is one in which two things are contrasted to
determine which is greater or smaller. The numerical size of the difference
between the values is unimportant and may not even need to be computed.
The Compare situation relies heavily on prior knowledge that individuals
have about relations. Most frequently, the Compare situation requires the
solver to choose either the larger or smaller of two values when the
operative relation is stated as a comparative adjective or adverb (e.g.,
faster, cheaper, shorter, more quickly). The objective is the determination
of whether one's response should be the larger or the smaller of the known
values. This situation most typically occurs as the final part of a multi-step
item. For instance, one often sees problems in which the solver is expected
to decide after several problem-solving steps which of two items offered for
sale is the better buy. This final determination is a Compare. It requires
only the recognition of which of the two items is less costly; it does not
require the computation of how much less.³ Most Compare items involve
values for only two objects, although it is certainly possible to make
comparisons among three or more.


³ It should be noted that the Compare situation defined here differs from the
semantic relation of the same name developed by Riley, Greeno, and Heller
(1983).




The Restate situation contains a specific relationship between two
different things at a given point in time. The relationship exists only for the
particular time frame of the story and cannot be generalized to a broader
context. There are two determining features of a Restate situation. First,
the two things must be linked by a relational statement (e.g., one of them is
twice as great as, three more than, or one half of the size of the
other). Second, the relationship must be true for both the original verbal
descriptions of the two things and the numerical values associated with
them. Thus, if Mary is now twice as old as Alice, then 20 years, which is
Mary's age, must be twice as great as 10 years, which is Alice's age. Note
that this relationship was not true one year ago nor will it necessarily be true
in five years.
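
The time-bound character of the Restate relationship can be checked with a few
lines of arithmetic. The sketch below is illustrative only; the function name
`restate_holds` and its parameters are assumptions introduced here. It uses the
ages given above for Mary and Alice.

```python
def restate_holds(age_a: int, age_b: int, factor: float, years_offset: int = 0) -> bool:
    """True if a's age is `factor` times b's age `years_offset` years from now."""
    return (age_a + years_offset) == factor * (age_b + years_offset)


mary, alice = 20, 10
print(restate_holds(mary, alice, 2))      # True: 20 is twice 10
print(restate_holds(mary, alice, 2, -1))  # False: 19 is not twice 9
print(restate_holds(mary, alice, 2, 5))   # False: 25 is not twice 15
```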

The Vary situation is characterized by a fixed relationship between two
things that persists over time. The two things may be two different objects
(e.g., boys and girls) such that one can describe a ratio as "for every boy
who could perform x, there were 2 girls who could do the same ....", or they
may be one object and a measurable attribute of it (e.g., apples and their
cost) with the problem having the form "if one apple cost $.50 then five
apples ....". An essential feature of the Vary situation is the unchanging
nature of the relationship. If one of the objects is varied, the amount of the
second changes systematically as a function of the known relationship. The
variation may be direct or indirect.
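
A minimal sketch of the two kinds of variation follows. It assumes that
"indirect" variation means inverse proportion, which is an interpretation and
is not stated in the text; the function names and the 120-and-4 example are
hypothetical. The direct cases use the apple prices above and the Vary example
of Table 1.

```python
def vary_direct(rate: float, amount: float) -> float:
    """Direct variation: the second quantity grows in proportion to the first."""
    return rate * amount


def vary_indirect(constant: float, amount: float) -> float:
    """Indirect variation (assumed inverse): the second shrinks as the first grows."""
    return constant / amount


print(vary_direct(0.50, 5))    # 2.5  (cost of five apples at $.50 each)
print(vary_direct(5, 35))      # 175  (citations for 35 pages at five per page)
print(vary_indirect(120, 4))   # 30.0 (hypothetical: 120 units shared among 4)
```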

Simple examples of these five situations are given in Table 1. During the
entire course of computer instruction, each of the situations is introduced,
explained, and transformed to a problem setting. Eventually, several are
linked together to form multi-step problems. In the introductory lesson,
each situation is described by means of an example and with the general
features which define it.

Although they are very simple and readily understandable, the five
situations are not intuitively known by students through previous
instruction. Experiments with groups from several different student
populations indicated that students (and teachers) do not typically recognize
or use situational knowledge in story problems (Marshall, 1991). Those
same experiments show that students of all ages are nevertheless able to
learn them.

The present study was designed to investigate how that learning comes
about. Because they were previously unknown to the students, the
situations in story problems were, in fact, five new concepts to be learned.
Thus, the study described here provides a setting for investigating how
individuals learn new concepts that have obvious ties to much of their
previous knowledge.


Table 1

The Five Situations


CHANGE    To print his computer job, Jeffrey needed special paper. He
          loaded 300 sheets of paper into the paper bin of the laser
          printer and ran his job. When he was done, there were 35
          sheets of paper left.

GROUP     The Psychology Department has a large faculty: 17 Professors,
          9 Associate Professors, and 16 Assistant Professors.

COMPARE   The best typist in the pool can type 65 words per minute on the
          typewriter and 80 words per minute on the word processor.

RESTATE   In our office, the new copier produces copies 2.5 times faster
          than the old copier. The old copier produced 50 pages every
          minute.

VARY      An editor of a prestigious journal noticed that, for a particularly
          wordy author, there were five reference citations for every page
          of text. There were 35 text pages in the manuscript.

The  Nature  of  Instruction 

To model successfully the acquisition of knowledge from instruction, one
must examine the nature of that instruction and the type of information
contained in it. Generally, there are two ways to present new concepts to
students. The instructor can introduce the name of the concept and give a
prototypic example. The example contains specific details and is couched
in a setting that should be well-understood by students. An alternative
approach is for the instructor to provide the name of the concept and give a
general description of its most important features. This information is
abstract and contains basic characteristics that should apply to all possible
instances of the concept. In practice, instructors typically do both. They
introduce a new concept by name, give a representative case in which the
concept clearly occurs, and then make a broad statement about the concept,
which is intended to help the learner generalize the concept from the given
example to other potential instances.

Some interesting research has been carried out to determine whether
students learn differentially under different instructional conditions. Usual
studies of instructional content tend to contrast one form of information
with another, such that each student sees only one type. An example of this
type of research is found in Sweller's (1988) comparison of problem-solving
performance following rule-based or example-based instruction.

The issue I address is different: Given access to typical instruction in
which both specific information (i.e., examples) and abstract information
(i.e., definitions) are available, which will a student remember? Do
students commit equal amounts of specific and abstract knowledge to
memory? Is one type necessarily encoded first, to be followed by the other?
Are there large individual differences? If so, are these differences related to
performance? The following experiment provides some initial answers to
these questions.


THE  EXPERIMENT 
Subjects 

Subjects were 27 college students with relatively weak problem-solving
skills. They were recruited from introductory psychology classes. On a
pretest of ten multi-step arithmetic word problems, they averaged six correct
answers.

Procedure 

Each student worked independently on a Xerox 1186 Artificial
Intelligence Workstation. All instruction and exercises were displayed on
the monitor, and the student responded using a three-button optical mouse.
Each student participated in five sessions, with each session comprising
computer instruction, computer exercises, and a brief interview. Students
spent approximately 45-50 minutes working with the computer in each
session and talked with the experimenter for about 5-10 minutes in the
interviews. As stated previously, only the first session (the introduction to
the five situations) is of interest here.




Data Collection

Data were collected from two sources: student answers to the first
exercise presented by the computer and student responses to the interview
questions. Each is described below.

Identification task. The first source of data was the computer exercise
that followed the initial instructional session. The items in this task
resembled those of Table 1. They were selected randomly for each student
from a pool of 100 items, composed of 20 of each type. During the
exercise, one item at a time was displayed, and the student responded to it
by selecting the name of one situation from a menu containing all five
names: Change, Group, Compare, Restate, Vary. The student received
immediate feedback about the accuracy of the answer, and if the student
responded incorrectly, the correct situation was identified.

The order of item presentation was uniquely determined for each student.
Items of each situation type remained eligible for presentation until one of
two criteria was obtained: Either the student had given correct responses
for 2 instances or the student had responded incorrectly to 4 of them. Thus,
a student responded to at least 2 items of each type and to no more than 4 of
them. The minimum number of items displayed in the exercise for any
student was 10, which occurred only if the student answered each of them
correctly. The maximum number that could be presented was 20 items,
which could happen only if a student erred in identifying the first two items
of all five types. The number of items presented ranged from 10 to 18.

Interview Responses. The second source of data was information given
by the students in the interviews. The interview followed immediately after
the identification task described above. During the interview, each student
was asked to describe the situations as fully as he or she could. The student
was asked first to recall the names of the situations and then to describe
each one that he or she had named. After each of the student's comments,
the experimenter prompted the student to provide additional details if
possible. All interviews were audiotaped and transcribed.

It is the interview data that reveal which pieces of instruction were
encoded and subsequently retrieved by each student. Certainly, not all of
the new knowledge acquired by an individual will be revealed in an
interview. It is expected that students have more knowledge than they can
access (as pointed out by Nisbett & Wilson, 1977). Nevertheless, the
interview data are indicative of how the individual has organized his or her
knowledge of the newly acquired concepts, and they suggest which pieces
of knowledge are most salient for the individual. Following well-known
studies such as Collins and Loftus (1975) or Reder and Anderson (1980),
we may assume that individuals will tend to retrieve the most closely
associated features and those with highest salience for the individual.

Knowledge Networks and Cognitive Maps

Data from the student interviews were used to construct knowledge
networks, one for each student. Each network consists of a set of nodes,
representing the distinct pieces of information given by the student, and
links connecting the nodes, representing associations between the pieces of
information.

The interviews were coded in the following way. First, irrelevant
comments were eliminated. These were things such as "Um, let me think"
or "I'm trying to remember ...." Next, distinct components or elements of
description were identified. These were usually phrases but could also be
single words. These became the nodes of the knowledge networks. Two
nodes were connected in a network if the student linked their associated
pieces of information in his or her interview response. Two research
assistants and the author coded each interview with complete agreement.

In addition to the knowledge network for each student, an "ideal"
network was constructed from the instructional text. As with the students'
networks, nodes were created to represent each distinct piece of
information. Two separate pieces of information appearing contiguously in
the text were represented by two nodes with a link between them. Needless
to say, this network was substantially larger than any student network. It
represents all that a student could possibly encode from the instruction, and
thus it serves as a template against which to measure the amount and type of
information encoded by each student. The "ideal" network for all of the
situational information is presented in Figure 1.

Two things should be noted about the network presented in Figure 1.
First, distances between nodes and spatial orientation of the nodes have no
meaning. Only the presence or absence of nodes and links is of importance.
Second, in this figure, all nodes appear equally important, and the same is
true for the links. Strength and activation are not shown. However, in
theory each node has a measure of strength that is a function of how many
times it appears in the instruction, and each link has a similar measure of
activation, depending upon how frequently the two nodes are linked.




Figure 1: THE "IDEAL" NETWORK


Figure 1 represents the ideal case in which all information is included in
the network. Normally students do not retain all of the details, and the
networks one constructs for them appear incomplete when compared with
the ideal situation. Thus, we expect the student networks to be
considerably sparser than that shown in Figure 1.

Several types of information may be gleaned from a student's knowledge
network. First, of course, the network is an indication of how much the
student remembered. The number of nodes in a network provides an
estimate of this information. Second, the network shows which pieces of
information are related for an individual. A measure of association can be
made by counting the number of links and using that number to estimate the
degree of connectivity of the entire network. Node count and degree of
connectivity are standard network measures. I have discussed elsewhere
how they may be used to estimate a student's knowledge of a subject area
(Marshall, 1990).
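
The two standard measures named above can be illustrated with a minimal
sketch. The text does not give a formula for degree of connectivity, so the
sketch assumes the common graph-density definition (observed links divided
by the number of possible node pairs). The function name and the example
node labels are hypothetical.

```python
def network_measures(nodes, links):
    """Return node count, link count, and density of an undirected network.

    Density here is links / possible pairs; this definition is an
    assumption, since the report does not state its connectivity formula.
    """
    node_set = set(nodes)
    # Keep each link once as an unordered pair; drop self-links and
    # links that mention a node the student never produced.
    link_set = {
        frozenset(pair)
        for pair in links
        if len(set(pair)) == 2 and set(pair) <= node_set
    }
    n = len(node_set)
    possible_pairs = n * (n - 1) // 2
    density = len(link_set) / possible_pairs if possible_pairs else 0.0
    return {"nodes": n, "links": len(link_set), "density": density}


# Hypothetical student network with four recalled details.
example = network_measures(
    nodes=["combine", "shirts", "shorts", "clothing"],
    links=[
        ("shirts", "clothing"),
        ("shorts", "clothing"),
        ("clothing", "shirts"),  # duplicate of the first link, counted once
    ],
)
```

In this toy case the network has 4 nodes and 2 distinct links out of 6
possible pairs.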




In this chapter I examine two additional types of information: (a)
specificity, which is the students' tendency to recall specific or abstract
features to describe the situations, and (b) confusions, which show the extent
to which students confused different aspects of the five situations. One
examines nodes to estimate the former and links to estimate the latter.

Specificity. Each node in the "ideal" network reflects one of two types
of detail: specific or abstract. Specific knowledge refers to elements of
information having to do with the examples presented in instruction, and it
reflects the particular details of the example. Abstract knowledge refers to
the general features or definition of the situation. The instruction contains
approximately an equal amount of both types, as can be seen in Figure 1.
The abstract nodes are represented by filled circles, and the specific ones are
indicated by hollow circles.⁴

Each distinct piece of information (i.e., each node) recalled by a student
was categorized as being specific or abstract. A response was considered to
be specific knowledge if it pertained to a specific example. Typically,
students giving this sort of response referred to details from the initial
example used in the computer instruction. An illustration is given in the
specific response of Table 2. The italicized phrases are examples of specific
detail. In contrast, a response was considered to be abstract knowledge if it
reflected a general definition or characterization. Table 2 also contains an
illustration of an abstract response, and the italicized phrases indicate the
abstract detail. The final example of a student response in Table 2
illustrates the case in which neither abstract nor specific detail is recalled.

Three measures of specificity were developed: the number of specific
responses, the number of abstract responses, and the ratio of abstract to
specific responses. These measures were used in the statistical analyses
described below.
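
As a rough illustration of these three measures, the sketch below computes
them from a list of node labels. The label strings and the function name
are hypothetical, and returning None when no specific responses were given
is an assumption made only to avoid division by zero.

```python
def specificity_measures(node_types):
    """Return (n_specific, n_abstract, abstract-to-specific ratio).

    node_types is a list with one entry per recalled node, each either
    "specific" or "abstract" (labels assumed for this sketch).
    """
    n_specific = sum(1 for t in node_types if t == "specific")
    n_abstract = sum(1 for t in node_types if t == "abstract")
    # The ratio is undefined without specific responses; None marks that case.
    ratio = n_abstract / n_specific if n_specific else None
    return n_specific, n_abstract, ratio


# A student who recalled 14 abstract and 3 specific details.
measures = specificity_measures(["abstract"] * 14 + ["specific"] * 3)
```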

Confusions. In the networks representing situational knowledge, two
types of links are possible, intra-situational and inter-situational links.
Intra-situational links are judged always to be valuable. That is, if two
nodes are both associated with one situation and they are connected to each
other, then the retrieval of one of the nodes ought to facilitate the retrieval
of the other. This is the principle of spreading activation. In general, the
more knowledge the individual has about a concept and the greater the
number of associations connecting that knowledge, the better the individual
understands it. Figure 2 shows how the "ideal" network of Figure 1 can be
represented as a two-layer map. The nodes at the upper level are the five
situations, and those at the lower level are the knowledge nodes developed
during instruction. Connections among the nodes at the lower layer
represent intra-situational links. Generally, a larger number of connections
at this level indicates greater understanding on the part of the individual. It
is these connections that are shown as well in the network of Figure 1.


⁴ It should be noted that the instruction was not developed under the constraint
that equal abstract and specific detail be contained in it. The guiding principle
was to explain each situation as completely as possible, using specific and/or
abstract elements as needed.


Table 2

Examples of Student Responses

ABSTRACT:

Q: What do you remember about Group?

A: Group is when you have different items, different
   groups of items, that can be categorized into one
   general group.

SPECIFIC:

Q: What about Group?

A: That was when you bought 7 shirts and 4 pairs of
   shorts and they grouped it into clothing. So you had 11
   separate things of clothing.

NONE:

Q: Tell me about Change.

A: I pressed that review button so many times and I can't
   remember anything right now. Um, change was, um,
   my mind is blank right now. I did okay on the
   computer. I've forgotten just about everything. I'm
   trying to think of an example. I know they change
   something and make something else.

In contrast, inter-situational links, i.e., links between different situations,
may or may not be of value to the individual's learning, because they are a
potential source of confusion. Such links will not always reflect
confusions; situations could in principle share one or more features. In the
present case, however, the instruction was carefully designed to eliminate
common features among situations. This is reflected in Figure 2 by the
connection from each node at the lower level to a single node at the upper
level. Given the design of instruction, there should be no inter-situational
links. That is, no node at the lower level should connect to more than a
single upper level node. Such linkages would be confusion links and
reflect a misunderstanding about the two situations so linked.


Figure 2: THE "IDEAL" MAP
[Two-layer map; upper-level nodes: Change, Group, Compare, Restate, Vary]


An example of differences in students' inter-situational and intra-
situational links is given in Figure 3. Two student maps are presented in
this figure. Both students encoded a relatively large amount of information
from the instruction, compared with other students in the experiment, but it
is clear from the figure that they recalled different elements of information.
Student S7 remembered distinct pieces of information about each situation
and showed no confusions. S22, on the other hand, expressed a number of
confusions, which are represented in Figure 3 by the dashed links between
the two layers of nodes. These cognitive maps are characteristic of
incomplete mastery. The situational knowledge of every student can be
described by such a map. Obviously, the deficits of a student are highly
individual. These individual differences will be discussed further in a later
section of this chapter.

In summary, the student network and its corresponding map provide
information about the number of details the student remembered about a
situation, the amount of connectivity, the type of knowledge (abstract
or specific), and the number of confusions in the student's response. The
networks and the measures described here were the bases for the statistical
analyses presented below and also served as input to the simulation model,
which is described in the section following the statistical analyses.


Figure 3: TWO STUDENT MAPS


STATISTICAL  ANALYSIS 

Three questions are addressed by the statistical evaluation. The first is
whether students remember different amounts of detail from instruction, the
second is whether one can characterize the type of information encoded by a
student, and the third is whether these differences are related to the students'
success on the identification task. Evaluation of the student networks
shows that some students were more likely to encode mostly specific
details, some were more likely to encode mostly abstract information, some
encoded both in about equal proportions, and some encoded almost nothing.
The statistical analysis evaluates whether these tendencies are related to
performance on the identification task and whether the relationship can be
generalized to the entire group of students.




It is evident from the interview data that students varied greatly in the
amount of information they were able to recall about the five situations.
The number of different details retrieved by students extended from a low of
3 to a high of 20. The mean number of details was 13.5, with a standard
deviation of 4.02.

The number of abstract and specific details recalled also varied, and the
ratio of abstract to specific detail ranged from 14:3 to 6:14. Thus, the
answers to both the first and the second questions are affirmative: There
were clear differences in the total amount of information recalled as well as
differences in the amount of abstract and specific information.

Two analyses provide insight into the importance of this difference.
First, on the basis of their interview responses, students could be divided
into three groups: Abstract, Specific, and Both. Students classified as
Abstract gave predominantly definitional responses in the interview. Those
classified as Specific used mostly example information from the computer
instruction to describe the situations. Those classified as Both responded
with approximately equal numbers of abstract and specific detail. For
membership in either the Abstract or Specific group, students had to have
given at least 9 different pieces of information during the interview with at
least twice as many instances of one type of information as the other.
Approximately equal numbers of students could be classified as Abstract or
Specific, with 6 in the former and 7 in the latter. An additional 11 students
were categorized as Both. These students gave at least 9 responses with
approximately equal numbers of abstract and specific details.
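
The grouping rule just described can be sketched as a small function. This
is an illustration only: the function name and parameters are hypothetical,
the response total is assumed to be the sum of abstract and specific
details, and "approximately equal" is assumed to mean any split that falls
short of the two-to-one criterion.

```python
def classify_student(n_abstract, n_specific, min_responses=9, dominance=2.0):
    """Assign a student to the Abstract, Specific, or Both group.

    Returns None when the student gave fewer than min_responses details
    and is therefore left out of the grouping.
    """
    total = n_abstract + n_specific
    if total < min_responses:
        return None
    # At least twice as many details of one type as of the other.
    if n_abstract >= dominance * n_specific:
        return "Abstract"
    if n_specific >= dominance * n_abstract:
        return "Specific"
    # Assumed reading of "approximately equal numbers".
    return "Both"


groups = [
    classify_student(14, 3),  # the most abstract ratio reported, 14:3
    classify_student(6, 14),  # the most specific ratio reported, 6:14
    classify_student(7, 6),   # hypothetical near-even split
    classify_student(2, 1),   # hypothetical student with too few details
]
```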

Figure 4 shows the relative performance on the identification task of the
three groups described above. A one-way analysis of variance, with a
dependent measure of correct responses to the identification task,⁵ indicates
that the groups differed significantly in their ability to recognize the
situations, F(2, 21) = 4.53, p < .025.⁶ As can be seen in Figure 4,
students who responded primarily with abstract characterizations of the five
concepts were most successful, followed by those who used both types of
information. The group relying on examples only were less successful than
those using abstract only or abstract knowledge in conjunction with specific
details. The performance of the abstract group was significantly higher
than the performance of the example group, t(21) = 3.005, p < .01.


⁵ It will be recalled that students viewed differing numbers of items on this
exercise. For purposes of comparison in this analysis, only the first two
exemplars of each type of situation were scored. Thus, each student received a
score from 0-10.

⁶ Complete data were not recorded for two students. One loss was the result of
computer failure and the second was the result of a malfunction in the recording
of the interview. These two students were excluded from the analyses reported
here. Two other students having only 6 and 3 interview responses respectively
were also excluded from this analysis.


Figure 4: GROUP PERFORMANCE
[Chart of identification-task performance by GROUP: Example, Both, Abstract]


The above analysis shows that differences in student performance can be
explained in terms of whether a student remembered abstract or specific
information. One also expects that the absolute number of details that a
student remembers, regardless of whether they are definition or example,
would be a good predictor of performance. Surprisingly, this is not the
case. The Pearson product moment correlation between the number of
correct responses on the performance test and the total number of nodes
encoded from the student's interview is .074, accounting for less than 1% of
the variance.

A second and more informative way of analyzing the data is a multiple
regression analysis based on the type and amount of information, the inter-
situational confusions, and the interaction between the two. In this analysis,
the predictors are (1) $X_1$, the ratio of abstract to specific detail, (2) $X_2$,
the number of confusions mentioned explicitly by the student, and (3) $X_3$, a
product variable of the first two predictors. The dependent measure, again,
is the 10-item identification task. The resulting prediction equation was:
$\hat{Y} = 6.667 + .602X_1 + .545X_2 - .617X_3$, with all coefficients reaching the
conventional .05 level of significance. The model accounted for 43% of the
variance and was statistically significant, $R^2 = 0.43$; F(3, 21) = 5.38, p <
.01.
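
To make the role of the product term concrete, the following sketch
evaluates the prediction equation for given predictor values. The
coefficients (6.667, .602, .545, and -.617) are taken as read from the
equation above; because the scanned equation is partly garbled, they should
be treated as an assumed reading. The function name is hypothetical.

```python
def predicted_score(ratio, confusions):
    """Predicted identification score (out of 10) from the regression.

    ratio      -- X1, abstract-to-specific ratio
    confusions -- X2, number of confusions stated in the interview
    Coefficients are an assumed reading of the scanned equation.
    """
    x1 = ratio
    x2 = confusions
    x3 = x1 * x2  # product (interaction) variable
    return 6.667 + 0.602 * x1 + 0.545 * x2 - 0.617 * x3


# Equal abstract and specific detail, with no confusions and with two.
no_confusions = predicted_score(1.0, 0)
two_confusions = predicted_score(1.0, 2)
```

Under this reading, each additional confusion changes the prediction by
.545 - .617 times the ratio, so the effect of confusions is estimated to
turn negative once the ratio exceeds roughly 0.88.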

In general, students with higher abstract to specific ratios performed
better on the identification task and made fewer confusion errors. Students
with low ratios (i.e., those with more specific answers) named relatively few
confusions but also responded with fewer correct answers. Students with
approximately the same number of specific and abstract responses had the
greatest number of stated confusions.

Thus, the statistical analyses suggest several group characteristics with
respect to learning new concepts. That is, there are tendencies of response
that apply over many individuals, not just a single one. These analyses are
based on summaries of the cognitive maps and aggregate responses to the
identification task. A more detailed investigation of individuals' responses
provides additional information about the nature of learning in this study.

THE  COGNITIVE  MODEL 

A more exacting analysis of the relationship between each student's
cognitive map and his or her responses to the identification task was carried
out by simulating the responses using a simple feed-lateral connectionist
model. The model simulates for each student his or her response to each
item of the identification task that the student actually attempted to identify.

The general model is given in Figure 5. It has three types of units:
inputs, student nodes, and outputs. Inputs to the model are coded
representations of the problems, and outputs are the names of the situations.
As in most connectionist models, activation spreads from the input units at
the lowest level to those of the intermediate level(s) through their
connections. At the middle level, activation spreads laterally from the
nodes directly activated by the lower level units to other nodes at the same
level with which they are linked (this is represented by the two middle
layers in Figure 5). Finally, the total activation coming into each unit at
the output level is evaluated, and the output unit with the highest activation
is the model response. Unlike many connectionist models, the units at the
middle layer, their connections with other nodes at this level, and their
linkages to the upper level are determined explicitly from empirical data.




Table 3

Item Characteristics Used to Encode Story Situations


General Characteristics:

  Set modification
  Permanent alteration
  Class inclusion (explicit or implicit)
  Relation between two objects
  Relation between an object and a property of that object
  Fixed relation (implied)
  Relative size
  Size differential
  Percentage
  Causality
  Multiple agents
  Multiple objects
  Unit measurement
  Two identical relations

Key Phrases:

  Each/every/per
  As many as
  Have left
  Altogether/A total of
  More/less
  Cost
  Same
  If-Then
  Money

Time Features:

  Specific time elements (minutes, days, weeks)
  Before/after


The bottom layer of units. The inputs consist of information about the
items that comprise the identification task. There are 27 possible
characteristics that can be present in any item. The set of characteristics is
given in Table 3. Each item is coded according to these characteristics as a
27-element vector containing 0s and 1s, with 1 indicating the presence of a
characteristic and 0 its absence. Not all characteristics will be present in
any single item; usually a simple situation requires only a few of them. The
mean number of characteristics for the 100 items used in the identification
task was 4.33. All 100 items were encoded by three raters with complete
agreement.
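
The presence/absence coding of an item can be sketched as follows. To keep
the example short it uses only six of the characteristic names from Table 3
rather than the full set of 27, and the function name and the sample item
are hypothetical.

```python
# Illustrative subset of the Table 3 characteristics (the model uses 27).
CHARACTERISTICS = [
    "Set modification",
    "Permanent alteration",
    "Class inclusion",
    "Relative size",
    "Have left",
    "More/less",
]


def encode_item(present, characteristics=CHARACTERISTICS):
    """Code an item as a 0/1 vector, one element per characteristic."""
    unknown = set(present) - set(characteristics)
    if unknown:
        raise ValueError(f"unknown characteristics: {sorted(unknown)}")
    return [1 if name in present else 0 for name in characteristics]


# A hypothetical item judged to show two of the characteristics.
vector = encode_item({"Set modification", "Have left"})
```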

The middle layers of units. For each student, the middle layers of the
model contain a set of nodes and the connections between them. The two
layers have identical sets. The nodes and links were identified from the
student interviews, as described previously, and they formed the basis of the
statistical analyses of the preceding section. Three trained individuals read
the transcript of each individual's interview and determined which nodes
were present and whether they were linked. As in the characteristics coding
above, the three coders were in complete agreement.

The top layer of units. The outputs for the model are the five situation
names: Change, Group, Compare, Restate, and Vary. Only one output is
produced for a given input vector. The five possible outputs compete, and
the one with the highest accumulated activation wins.

Connections between the bottom and middle layers. Each input
element may connect directly to one or more of the nodes contained in the
student's network (represented by the middle layers of nodes). Two layers
are needed in this model to illustrate the feed-lateral aspect. The lower of
the node layers connects to the input units. The second layer illustrates how
the nodes connect with each other. Each node from the lower node set
connects to itself and to any other nodes to which it is linked, as determined
from Figure 2. Thus, activation spreads from the input units to the lower
node layer. Each node transfers its own activation to the next layer and also
spreads additional activation to any other nodes to which it is connected.
This particular two-layer representation of a feed-lateral network preserves
the usual constraint that activation spreads upward through the model.

Some of the input elements (i.e., those units represented at the very
bottom of Figure 5) may activate many nodes in the network, some may
activate only a few, and some may fail to make a connection (if the student
lacks critical nodes). The allowable linkages between the input and middle
layers of units were determined by mapping the input characteristics to the
"ideal" map of the entire instruction. Recall that the input characteristics are
general features. Most of them activate multiple nodes, and these nodes are
frequently associated with different situations. Thus, it is rare that one input
characteristic points to a single situation. The full pattern of possible
activation is shown in Figure 5. Note that this figure illustrates all
characteristics as they link to all nodes and is thus a theoretical pattern. The
model would never be presented with a problem containing all possible
features, nor did any student have all possible nodes at the middle layers.

Once the student network receives the input, activation spreads from the
nodes directly targeted by the input elements through any links they have to
other nodes at this level. All of the activated nodes then transmit their total
activation to the units at the upper level. The amount of activation for each
situation is determined from the accumulation of activated links leading to
it. The five situations compete with each other for the highest level of
activation, and the one with the highest value becomes the output. Thus,
the model of Figure 5 represents the input of an item, the activation of the
student's semantic network, the competition among situations, and the final
output as a result of total activation throughout the model.

The model depends upon the set of nodes for each student, the pattern of
linkages among them, the overall association of subsets of nodes with the
situation labels, and the input characteristics of the items. All except the
latter are derived from the student cognitive maps described earlier.
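
The flow of activation described in this section can be sketched in a few
lines. This is a minimal illustration under stated assumptions, not the
program used in the study: every connection is given a weight of 1, ties
between situations are broken alphabetically, the toy input has three
elements rather than 27, and all function, parameter, and node names are
hypothetical.

```python
def simulate_response(item_vector, input_to_nodes, lateral_links, node_to_situations):
    """Return the situation with the highest accumulated activation.

    item_vector        -- 0/1 list of item characteristics
    input_to_nodes     -- characteristic index -> nodes it may activate
    lateral_links      -- node -> nodes linked to it in the student's network
    node_to_situations -- student's nodes -> situation names they link to
    Returns None if nothing in the student's network is activated.
    """
    # Bottom layer to lower node layer: only nodes the student has can fire.
    direct = {}
    for index, present in enumerate(item_vector):
        if present:
            for node in input_to_nodes.get(index, ()):
                if node in node_to_situations:
                    direct[node] = direct.get(node, 0) + 1

    # Lateral step: each node keeps its own activation and also passes it
    # to every node it is linked with.
    total = dict(direct)
    for node, activation in direct.items():
        for neighbour in lateral_links.get(node, ()):
            if neighbour in node_to_situations:
                total[neighbour] = total.get(neighbour, 0) + activation

    # Top layer: accumulate activation per situation and pick the winner.
    scores = {}
    for node, activation in total.items():
        for situation in node_to_situations[node]:
            scores[situation] = scores.get(situation, 0) + activation
    if not scores:
        return None
    return max(sorted(scores), key=scores.get)


# Toy student network with three nodes and one intra-situational link.
input_to_nodes = {0: ["combine"], 1: ["more_than"], 2: ["total"]}
lateral_links = {"combine": ["total"], "total": ["combine"]}
node_to_situations = {
    "combine": ["Group"],
    "total": ["Group"],
    "more_than": ["Compare"],
}
response = simulate_response([1, 0, 1], input_to_nodes, lateral_links, node_to_situations)
```

A confusion link would appear in this sketch as a node that lists more than
one situation in node_to_situations.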

Model Verification

As a test of the model's adequacy, a simulation was carried out in which
the ideal network of Figure 2 was used as the student model. The 100 items
available in the identification task were presented to the model, and its
responses were compared with the correct answers. The model performed
with 100% accuracy, successfully identifying the situations for all items.

Simulation  Results 

A simulation of each student's performance on the identification task was
carried out. For each student, the response to the first item encountered in
the exercise was simulated first, using that item's vector of characteristics
and the student's network information. The second item followed, and then
all subsequent items until the exercise terminated. Thus, the simulation
covered all items presented to the student in the order in which the student
saw them.

As described above, the number of items answered by students varied
from 10 to 18, yielding a total of 360 item responses. A comparison of the
results of the simulation of these 360 responses with the actual student
responses to them is given in Table 4.




Table 4

Simulation Results


Outcome        Frequency of          Frequency of
               Observed Outcomes     Adjusted Outcomes*

(1) CSM              192                   192
(2) C̄SM               64                    64
(3) C̄S̄M̄               19                    13
(4) CSM̄               30                    13
(5) CMS̄               55                    51

Total                360                   333

Key:  (1) CSM    Both model and student answered correctly.
      (2) C̄SM    Model and student made the same error.
      (3) C̄S̄M̄    Model and student made different errors.
      (4) CSM̄    Student answered correctly; model erred.
      (5) CMS̄    Model answered correctly; student erred.

(C = correct response; S = student response; M = model response)
*Impossible matches excluded


Table 4 presents the observed classification of the students' responses as
well as an adjusted classification against which the model was compared.
All 360 items comprise the observed classification. In the adjusted
classification, some items have been omitted from consideration because the
model was constrained by a lack of information from the student interview.
This occurred under the following condition: If a student was unable to
remember the name of a situation or anything that described it in the
interview, the model for that student would have no nodes at the middle
layer that could link to the situation name. Thus, the model would be
constrained to ignore that situation and would never generate a response
pointing to it. Consequently, if a student omitted entirely a situation in the
interview, all items for which the student gave that situation as a response
were likewise eliminated. There were 27 of these impossible matches. As
shown in Table 4, 17 of these were items which the student answered
correctly, and 10 were items on which the student erred. It should be noted
that these are not model failures but are interview failures.

Each application of the model to a vector of item characteristics,
representing a single item, resulted in one of five outcomes, as shown in
Figure 5. Outcomes CSM and C_SM are exact, successful simulations of
the model. In both cases, the model generated a response that was identical
to the one produced by the student. In the first the response was correct,
and in the second it was an error. The outcome C_S_M is considered to be
a partial success of the model. Both the student response and the model
response were in error, but they were different errors. In these cases, the
model accurately predicted that the student lacked critical knowledge and
would err.

The remaining two outcomes, CS_M and CM_S, represent simulation
failures. The most serious of these is CS_M, reflecting cases in which the
student answered correctly but the model failed to do so. They are serious
failures because they suggest that the model did not capture sufficiently the
student's knowledge about the situations. It should be noted that more than
half of the observed instances of CS_M were impossible matches, as
described previously. That is, the student omitted any discussion of the
situation in the interview, and the model was subsequently constrained to
ignore it. As mentioned above, these instances are considered to be
interview failures rather than model failures. Only the remaining 13
instances are true model failures, representing just 3.9% of all responses.

The final outcome category, CM_S, also represents model failure but is
less critical than the failures of CS_M. In this category, the model made a
correct response when the student did not.
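The five outcome categories amount to a small decision procedure. The following Python sketch is illustrative only: the function name and the single-letter situation labels are assumptions introduced here, not part of the original simulation. The returned labels follow the notation used in the text.

```python
def classify_outcome(correct, student, model):
    """Classify one item by comparing the correct situation (C),
    the student's response (S), and the model's response (M)."""
    student_ok = student == correct
    model_ok = model == correct
    if student_ok and model_ok:
        return "CSM"      # both model and student answered correctly
    if student_ok:
        return "CS_M"     # student answered correctly; model erred
    if model_ok:
        return "CM_S"     # model answered correctly; student erred
    if student == model:
        return "C_SM"     # model and student made the same error
    return "C_S_M"        # model and student made different errors


# Hypothetical situation labels "A" to "E" stand in for the five situations.
print(classify_outcome("A", "B", "B"))
```

Under this reading, only CS_M and CM_S count as simulation failures; C_S_M is the partial success described above.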

Many of the CM_S simulation failures can be explained by considering
the students' experience as they respond to the identification task. During
the actual task, many students made errors on one or more situations and
then apparently learned to classify these same situations correctly. This is
evidenced by their patterns of responses, typically an incorrect response to a
situation followed by two correct responses to the same situation, with no
additional errors. What has happened in such cases is that the student's
knowledge network presumably changed during the course of the task. The
knowledge base that generated the early incorrect responses is not
necessarily the same one that generated the later successful ones. And, it is
only the latter that is reflected in the student's interview. In such instances,
the model would correctly match the two correct responses, but it would
also give the correct response to the first item that the student missed. The
model does not learn. It simulates the state of the student at the end of the
exercise, as reflected in the interview. If the student learned during the
course of the exercise, we have no way of knowing what node configuration
corresponded to the earlier, incorrect responses. Under the most
conservative criterion of learning (an error followed by two correct
responses), 25% of the mismatches can be accounted for by student learning.
In each case the model gave the correct response to all three items.
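The conservative learning criterion can be written as a simple test on one student's sequence of responses to a single situation. This Python sketch is an illustration under stated assumptions: the function name is hypothetical, and it assumes the error must be the first response to the situation and that every later response is correct.

```python
def probable_learning(responses):
    """responses: list of booleans (True = correct) for one student's
    successive items on ONE situation, in presentation order.

    Assumed reading of the criterion: a single initial error followed
    by at least two correct responses and no additional errors."""
    return (
        len(responses) >= 3
        and not responses[0]          # the first response was an error
        and all(responses[1:])        # every later response was correct
    )


print(probable_learning([False, True, True]))
```

A sequence of several errors followed by a single correct response fails this test, which matches the decision not to count such cases as learning.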

Another 25% of the mismatches occurred when both the model and the
student selected different wrong situations as the response option. In these
cases, the model correctly determined that the student would not give the
correct response. The model's answers may differ from the student's for a
number of reasons, including guessing. These were, after all, multiple
choice exercises, in which students were asked to select the correct situation
from the menu of five possible ones. Students probably guessed at some of
the answers, but the model does not guess.

There are other possible explanations for the model failures. On the one
hand, some students may have been prone to "slip" as they made their
selections using the mouse, resulting in the unintentional selection of the
option residing either above or below the desired one. It is not an
uncommon phenomenon, as those who use a mouse frequently can attest.
Accidental errors of this sort are undetectable. Similarly, students may have
used a test-taking strategy, such as avoiding the selection of one response if
they used it on the immediately preceding exercise. These errors are also
undetectable: The model does not take test-taking strategies into account.

If we consider the "probable learning" mismatches (i.e., those that were
followed by two correct matches on the same situation) and the "different
error" mismatches (i.e., those in which the model and student both erred but
selected different errors) as understandable or explainable discrepancies,
the total number of mismatches between students and the model is reduced
from 77 to 51, leaving only 13 CS_M and 38 CM_S as mismatches. Thus,
the model satisfactorily accounts for 85% of all student responses.
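The percentages quoted in this section can be checked against the adjusted frequencies of Table 4. The short Python sketch below only reproduces that arithmetic. The variable names are assumptions, and the count of 13 "probable learning" items is not stated directly in the text; it is inferred here from the 51 adjusted CM_S items and the 38 that remain unexplained.

```python
# Adjusted outcome frequencies from Table 4 (impossible matches excluded).
adjusted = {"CSM": 192, "C_SM": 64, "C_S_M": 13, "CS_M": 13, "CM_S": 51}

total = sum(adjusted.values())                              # 333 responses
true_failure_rate = adjusted["CS_M"] / total                # 13 of 333

# Every outcome other than an exact match is a mismatch.
mismatches = total - adjusted["CSM"] - adjusted["C_SM"]     # 77

different_error = adjusted["C_S_M"]                         # 13
probable_learning = adjusted["CM_S"] - 38                   # inferred: 13
remaining = mismatches - different_error - probable_learning  # 51

accounted = (total - remaining) / total

print(f"true model failures: {100 * true_failure_rate:.1f}%")
print(f"responses accounted for: {100 * accounted:.0f}%")
```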

A final evaluation of the model's performance comes from examining
how well individual student performance was simulated by the model. The


^ Several other instances exist in which the student made multiple errors on a
situation and then responded correctly to one final instance of that situation.
While it is very plausible that learning also occurred in these cases, one hesitates
to draw a conclusion based only on one response. Thus, these errors remain




Table 5

Student   No. of   No. of         Percent   Percent       Total
          Items    "Impossible"   Matches   "Explained"   Percent
                   Matches                  Matches       Matches

   1        13         3           100%        0           100%
   2        15         0            80%        7%           87%
   3        13         3           [illegible]
   4        13         3           [illegible; "10%" is the only surviving fragment]
   5        16         0            75%        6%           81%
   6        11         0           100%        0           100%
   7        14         0            86%        7%           93%
   8        13         0            92%        0            92%
   9        14         5           100%        0           100%
  10        15         2            85%        7%           92%
  11        15         3            67%        8%           73%
  12        16         0            63%       31%           94%
  13        14         0            71%        0            71%
  14        13         0            69%        8%           77%
  15        14         3           100%        0           100%
  16        13         3            90%       10%          100%
  17        15         0            73%       14%           87%
  18        14         0            79%        7%           86%
  19        14         0            79%        7%           86%
  20        18         0            67%        7%           72%
  21        13         0            69%        0            69%
  22        16         0            69%       12%           81%
  23        16         0            56%       13%           69%
  24        16         2            57%        7%           64%
  25        16         0            69%       12%           81%


results for each student simulation are given in Table 5. Two measures of
success are given in the table. The first is the number of exact matches,
excluding the "impossible" ones. The second is the overall percentage of
satisfactory matches for each individual and is given in the extreme right-
hand column of the table. This percentage is based on the number of
satisfactory matches, including the "probable learning" and "different error"
mismatches described above but eliminating from consideration the
"impossible" matches. As can be seen in Table 5, the performance of 6 of
the 25 students was fit exactly by the model with 100% agreement. The
model simulated the performance of an additional 12 students with accuracy
between 80-99%. The model's success rate fell below 70% for only 3
students, to a low of 64%.

DISCUSSION 

There are several important implications that result from this study. They
are discussed below with respect to the three questions posed in the
introduction: What do they learn, when do they learn it, and what can they
retrieve?

What specific information does a student learn from initial instruction
about a new topic?

One of the most striking findings was that students tended to encode and
use specific details from the initial examples used in instruction. Almost all
of the example nodes had to do with the five introductory examples, despite
the fact that several other examples were given later in the instruction. (See,
for example, the Specific response of Table 2.) This finding suggests that
the very first example of a concept is highly important and should,
therefore, be carefully developed. For many students, the initial examples
provided the scaffolding for the semantic networks. Some of the details of
those examples led to erroneous connections. As a case in point, the
example for one of the situations was based on money, leading some
students to expect (incorrectly) this situation to be present whenever money
was in the problem. These faulty connections were very evident in their
interview responses.

A general pattern of encoding was apparent from the students' responses.
Several students described the situations only in terms of the examples.
When prompted, they were unable to embellish their descriptions by using
abstract characterizations. No instance of example information followed by
abstract information was observed. In contrast, students having abstract
knowledge always used it in preference to giving example details. That is,
their initial responses were generalizations. When prompted for more
information, they used example details to support their abstract descriptions.
These findings suggest that students may first encode the example
information and then build the abstract network around it. Once formed, the
abstract portion of the network becomes stronger upon exposure to
additional examples, whereas the example portion does not augment its
activation or strength. If the abstract information is not encoded, the details
of the example (which received high strength initially) remain the most
salient elements of the network.

How is the information that the student encoded in memory related to
the student's performance?

The statistical analyses suggest that the degree to which a student is able
to use his or her abstract information is positively related to the student's
success on the identification task. Those able to express mainly abstract
knowledge apparently had the best understanding of the five concepts and
were most easily able to identify them. Those for whom the abstract
characterizations were somewhat incomplete (e.g., those who were able to
give abstract descriptions for some concepts but needed example details to
describe others) performed less well but still were more successful than
those who predominantly relied on example details.

The primary implication of this finding is that instruction should be
developed to facilitate the linkage of abstract knowledge to easily
understood example knowledge. The examples were undoubtedly salient
and easily encoded. For some students, the abstract characterizations were
equally easy to encode, but this was not universally true.

Does the cognitive model reflect this relationship?

The connectionist model is a useful way to examine individual
performance of students as they identified these concepts. The simulation
of individual performance was extremely successful. The high level of
agreement between model performance and student performance suggests
that the model captures most of the salient and discriminating information
actually used by the students. Most important, the model demonstrates the
impact of missing nodes and erroneously linked pairs of nodes. In many
cases, knowledge of which nodes were missing led to accurate predictions
of subjects' erroneous responses. In others, incorrect linkages among nodes
also led to accurate predictions of errors. The model and its simulation
provide strong support for the use of cognitive networks to represent
learning of concepts.


REFERENCES 


Collins, A. M., & Loftus, E. F. (1975). A spreading activation theory of
semantic processing. Psychological Review, 82, 407-428.

Marshall, S. P. (1990). Selecting good diagnostic items. In N. Frederiksen,
R. Glaser, A. Lesgold, & M. Shafto (Eds.), Diagnostic monitoring of skill
and knowledge acquisition (pp. 433-452). Hillsdale, NJ: Lawrence
Erlbaum Associates.

Marshall, S. P., Barthuli, K. E., Brewer, M. A., & Rose, F. E. (1989).
STORY PROBLEM SOLVER: A schema-based system of instruction.
Technical Report 89-01 (ONR Contract N00014-85-K-0661). San Diego:
San Diego State University, Center for Research in Mathematics and
Science Education.

Marshall, S. P. (1991). Understanding the situations of arithmetic story
problems: A basis for schema knowledge. Unpublished manuscript.

Nisbett, R. E., & Wilson, T. D. (1977). Telling more than we can know:
Verbal reports on mental processes. Psychological Review, 84, 231-259.

Riley, M. S., Greeno, J. G., & Heller, J. (1983). Development of children's
problem-solving ability in arithmetic. In H. Ginsburg (Ed.), The
development of mathematical thinking (pp. 153-196). New York:
Academic Press.

Tarmizi, R. A., & Sweller, J. (1988). Guidance during mathematical
problem solving. Journal of Educational Psychology, 80, 424-436.




Problem-Solving Schemas: Hybrid Models of Cognition


Sandra P. Marshall
San Diego State University
San Diego, CA

John P. Marshall
Crystal Graphics, Inc.
Santa Clara, CA

An ongoing controversy in the cognitive science community centers on the
nature of the models used to represent cognitive phenomena. The two primary
contenders are production-system models (such as ACT* and SOAR) and
connectionist models (such as those produced by McClelland and Rumelhart or
Grossberg and his associates). Critics of both sides argue that the other cannot
suffice to capture human behavior. Both appear to be right. Perhaps what is needed
is a model that combines the best features, and lessens the worst features, of both
kinds of model. A hybrid model having these characteristics is the topic of this
report.

The recognition that hybrid models are needed is not new. A number of
prominent researchers (from both sides of the argument) have suggested that some
union of the two representations is in order. For example, in The Computer and the
Mind, Philip Johnson-Laird hypothesized that one way to get around some of the
dilemmas posed by existing models of cognition was to "postulate different levels
of representation: high-level explicit symbols and low-level distributed symbolic
patterns" (p. 192). The opinion of a long-time connectionist is reflected in the title
of a recent article: "Hybrid Computation in Cognitive Science: Neural Networks
and Symbols" (J. A. Anderson, 1990). And Marvin Minsky echoes the sentiment
in his 1991 paper, "Logical Versus Analogical or Symbolic Versus Connectionist or
Neat Versus Scruffy," where he states explicitly that we "need integrated systems
that can exploit the advantages of both" (p. 37).

The need to combine the two representations derives from the fact that
neither alone has been entirely satisfactory in modeling complex cognition.
Symbolic production systems, as the oldest and most widely used of the two, have
been very successful in describing some important aspects of rule-based problem
solving. They have been widely used in artificial intelligence and have greatly
influenced the development of intelligent tutoring systems (Wenger, 1987).

At the same time, such systems are noted for their inflexibility on some
relatively simple tasks, such as object recognition and classification. Bereiter
(1991) provides a good discussion of some of the central problems with rule-based
cognitive models. As he (and others) point out, humans are not particularly good at
working out extended logical sequences. They make mistakes. A production
system does not make mistakes, and one difficulty in working with production
models is to produce human-like errors from the model. Many models consistently
have better performance than the humans they are intended to mimic.

A second difficulty lies in the way that production systems work, namely
systematically, orderly, and efficiently. People don't seem to have those
characteristics. We see this difficulty when we try to model complex problem
solving, using protocols generated by experts or by novices. Very few individuals
start at the beginning of a problem and proceed carefully through a top-down
process to reach the solution. To model their performance, we all too often are
forced to disregard some of the protocol material in our quest for sequential
rule-based performance. Moreover, many individuals simply cannot articulate what
they are doing or explain why one part of a problem triggers a particular response
from them, which suggests that their activity is not entirely a neat and orderly
process.

Nonetheless, there are clearly many instances in which individuals do
engage in rule-based cognition, and production systems have to date provided our
best means of modeling them. This is particularly apparent in well-specified
domains from mathematics such as arithmetic, algebra, or probability, and in areas
of physics such as electricity and magnetism. What is common in these domains is
that there are highly specific rules that need to be acquired and applied by
individuals in order to operate successfully in the domain. As a simple example,
consider arithmetic operations. It would be the rare person who performed
multiplication or long division without resorting to the use of a standard algorithm.
Modeling the acquisition and use of such algorithms are precisely the areas in which
production models excel.

On the other hand, connectionist models are weak in just these areas.
Connectionist models excel in pattern recognition rather than in logical sequences of
actions. Unlike production systems, connectionist models do not depend upon the
firing of independent units such as rules. Rather, a collection of units (nodes)
spread activation through their connections to other units. One does not trace the
history of a cognitive process very easily in a connectionist model because of this
feature. Subtle differences in the connection weights may yield large differences in
model response. At any point in the process, it is the pattern of weights that
matters, not the presence or absence of a particular unit.

A particular strength of connectionist models is the flexibility allowed for
inputs. Because the models depend on the pattern of weights over a great many
units, the presence or absence of any single unit is usually unimportant. Any input
typically is characterized by a great many units.




Given the unique and complementary nature of the two approaches (the
strength of the symbolic system for modeling sequences of actions and the strength
of the connectionist approach for modeling pattern recognition), it is reasonable to
anticipate hybrid models that will capitalize on their individual strengths.
Surprisingly, few true hybrid models have yet emerged, although one suspects that
the number under development is somewhat greater. One of the best examples
available now was developed by Walter Schneider to model controlled or automated
processes (Schneider & Oliver, in press).

The remainder of this report describes a particular hybrid model, a model of
schema instantiation in arithmetic problem solving. This model utilizes both
production systems and connectionist networks to represent schema knowledge.

Overview 


Types  of  Schema  Knowledge 

Schema knowledge for problem solving consists of four major components:
constraint knowledge, feature knowledge, planning knowledge, and execution
knowledge (Marshall 1990; in press a, b, c). There are key issues involved in each
type of knowledge, and each one has its own distinct representation in the full
model of a schema.

Constraint knowledge has to do with recognizing patterns. The question of
interest is: Does the stimulus problem contain a pattern of elements sufficient to
activate an existing schema? This pattern recognition is accomplished by a
connectionist component of the model.

Feature knowledge, on the other hand, has to do with deciding whether the
necessary elements are provided in the problem, given that the pattern has already
been recognized as characteristic of a schema, so that the schema can be
instantiated. This is a question best answered by a production system. There is also
a connectionist part to this knowledge. Several potential patterns within one schema
may exist in a problem, and the most reasonable or most likely one for solution
needs to be recognized. This is a special case for competitive performance by all
pattern candidates, to determine which pattern most strongly reflects the identified
schema.

Planning knowledge is for the most part sequential and consists of setting
goals and selecting operations for obtaining them. Again, a production system is
appropriate. Planning knowledge guides the entire problem-solving process, and it
calls on feature knowledge and constraint knowledge when it needs more detail or
more elaboration about the problem.




Finally, execution knowledge involves the step-by-step execution of
already-learned algorithms, which again calls for a production system. Execution
knowledge comes into play only when the plans call for it.

These types of schema knowledge have been the focus of a number of
experimental studies as well as the target of our modeling efforts. Each experiment
typically spanned several weeks and required subjects to participate in a number of
different tasks at various times. All of the experiments involved the Story Problem
Solver (SPS), a computer-based instructional program about arithmetic story
problems, and/or the Problem Solving Environment (PSE), a graphical system in
which students could practice what they learned under SPS. These systems are
described elsewhere (Marshall, Barthuli, Brewer, & Rose, 1989; Marshall, 1991).

Both SPS and PSE were designed around schema theory. In particular, they
were developed so that each of the four components of knowledge described above
could be isolated and evaluated as students acquired their schema knowledge. The
results of the experiments using these systems are given in several other papers
(Marshall, 1991; Marshall, in press a, c). The importance of the experiments for the
present report is that they provide empirical evidence against which our computer
models can be evaluated.

The Performance Model of Constraint Knowledge

We focused our attention initially on models of constraint knowledge. We
did so for two reasons: First, problem solving typically involves two general
aspects: recognition of the important parts of the problem and appropriate
application of techniques to these components to obtain a solution. The recognition
aspect demands constraint knowledge, suggesting that constraint knowledge is an
important initial point of access to schema knowledge. Second, we were interested
in how individuals understand and retain new information about a concept or set of
concepts. This, too, falls under constraint knowledge.

Constraint knowledge can be modeled very well using relatively common
connectionist models. Two models were created: a model that can mimic the
performance of subjects and a model which learns when given appropriate feedback.
Both of these were developed and evaluated as a first step in building the complete
hybrid model.

The model that simulates subject performance in identifying the situations
given in simple story problems is described in full in Marshall (in press).¹ It is a


¹ The article referenced in this citation is reproduced in the chapter immediately
preceding the present one in this final report. Details of the model are presented there.




three-layer feed-forward (and feed-lateral) model consisting of input units, a middle
layer of interconnected units, and a set of output units.² The model takes as its
input a binary vector having elements of 0 or 1, which represent 25 problem
characteristics. The model produces as its output the identification of one of five
situations that may occur in the story problem.

The input units are connected to a middle layer of units representing a
subject's knowledge about the situations. This layer of knowledge nodes
corresponds to the typical hidden unit layer found in many connectionist models, but
it is not hidden in this instance. The nodes here derive from student interviews. Each
student's interview about the situations was coded and transformed into a set of
nodes and links among nodes. The model used this information to derive its output
response. Again, full details of the simulation are given in Marshall (in press).
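To make the structure concrete, the following Python sketch shows one way a response could be derived from a student's knowledge nodes. It is a toy illustration under stated assumptions, not the simulation reported in Marshall (in press): the node names, the unweighted input links, the lateral factor of 0.5, and the two-situation example are all hypothetical.

```python
def simulate_response(item, nodes, lateral=0.5):
    """item: set of indices of characteristics present in the problem.
    nodes: dict mapping a knowledge-node name to a dict with
      'inputs' (set of characteristic indices feeding the node),
      'links' (set of other node names it is laterally linked to),
      'situation' (the situation name the node points to)."""
    # Feed-forward step: activation each node receives from the input units.
    direct = {name: len(item & node["inputs"]) for name, node in nodes.items()}
    # Feed-lateral step: each node also receives a share of its neighbours'
    # direct activation (the share is an assumed constant).
    spread = {
        name: direct[name] + lateral * sum(direct[o] for o in node["links"])
        for name, node in nodes.items()
    }
    # Output step: sum activation per situation and choose the strongest.
    totals = {}
    for name, node in nodes.items():
        totals[node["situation"]] = totals.get(node["situation"], 0.0) + spread[name]
    return max(totals, key=totals.get) if totals else None


# Hypothetical student with nodes for two situations, "A" and "B".
full_student = {
    "money_example": {"inputs": {0}, "links": {"abstract_A"}, "situation": "A"},
    "abstract_A": {"inputs": {1}, "links": {"money_example"}, "situation": "A"},
    "abstract_B": {"inputs": {2, 3}, "links": set(), "situation": "B"},
}
# A student who described nothing about situation "B" in the interview.
partial_student = {k: v for k, v in full_student.items() if v["situation"] != "B"}

print(simulate_response({2, 3}, full_student), simulate_response({2, 3}, partial_student))
```

In this sketch a student model with no node pointing to a situation can never name that situation, which is the constraint that a missing interview description would impose.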

The performance simulated by the model is student response to a computer-
based identification task. The model performed very well, simulating the
performance of a number of students exactly and accounting for a large majority of
responses for the rest. Both correct responses and specific errors were modeled.

The Learning Model³

The performance model provided the initial framework for the subsequent
learning model. The input units for this model are essentially the same as for the
performance model, as are the output units. A layer of hidden units replaces the
knowledge nodes that derived from the student interviews in the performance
model. Thus, these hidden units are hypothesized to exist but are unknown.
Moreover, the nature of their connections to the inputs and outputs is unknown.
The question of interest in this model is whether a connectionist model of this form
can learn to make the appropriate classification of the five situations.

The optimum number of hidden units for this case is undetermined.
Theoretically, it depends upon the optimum number of knowledge nodes that a
student should acquire, and this number is not known. From the instructional

² The model can also be conceived as a four-layer feed-forward model in which the
second and third layers contain the same units. This eliminates the problem of having
activation spreading laterally among units at any level. To achieve independence at all
levels, we insert a third layer of units that duplicates the units at the second layer.
Connections exist from all original units at the second layer to their counterparts at the
third layer and also to any other units to which they may be conceptually related.
³ In developing this model, we have drawn substantially from the models described in
chapter 8 of Rumelhart & McClelland (1986) by Rumelhart, Hinton, & Williams as well
as the treatment of them in chapter 5 of McClelland & Rumelhart (1989).


experiment reported in Marshall (in press), the maximum number would be 33, the
number of possible knowledge nodes. However, no student ever acquired all
possible nodes, and it is not clear that having all of them would produce maximum
performance. Several students made more than 90 percent successful responses
with many fewer nodes. In SPS-based experiments, we observed that students
typically acquired an average of 14 nodes. The range was 6 to 17. In general,
having more nodes did not necessarily mean that students performed more
successfully on the task.

Using the instructional experiment as a guideline, we include 14 hidden
units in the learning model. As an initial simplification, the model is constrained
to have 3 layers, eliminating the feed-lateral feature of the performance model. The
three layers are the input layer, containing information about the problem to be
classified; the hidden layer, containing units that correspond roughly to the
knowledge nodes of the performance model; and the output layer, containing the
names of the five possible situations.

The model requires specification of a learning rate, η. This rate defines
how strongly the model reacts to incorrect answers with each trial. The learning rate
must be chosen carefully so that the system will converge to the correct solution in a
reasonable amount of time. A learning rate which is too large may cause the system
to converge to an incorrect solution, while a learning rate which is too small may
prevent convergence in a practical amount of time (and perhaps at all). In general,
for the network to stabilize (i.e., for learning to occur), the larger the number of
hidden units, the lower the learning rate. We found that learning rates between .05
and .10 were most satisfactory.

For this model we also include a momentum factor, μ. This factor allows
the system to carry over learning from previous problems when new problems are
presented. As Rumelhart and McClelland (1989) point out, without the inclusion of
a momentum factor, the system may converge to a "local" solution and stabilize
there, even though there is a better "global" solution (p. 132). A suitably large μ
prevents the model from getting stuck in such local solutions. In addition, a
momentum term tends to speed up the model, because it allows the specification of
a higher learning rate.

Testing the model corresponds to running it over enough trials for it to
reach some pre-determined criterion. Each trial proceeds through the steps listed
below:

o presentation of a randomly selected input vector;

o forward propagation of activation from input to hidden units
and from hidden to output units;

o calculation of the errors associated with each output unit;

o backward propagation of errors from output to hidden units;

o modification of the weights of the connections between all
layers of units based on the errors.

Each of the main components of the model is described briefly below.

The Task. The task for which this model was developed is to learn the
appropriate classification of a set of 100 story problems according to the situations
depicted in them. Each problem is represented by a set of characteristics which the
model uses in making its classification. Five output responses are possible.

Inputs. The inputs to the model are the set of 100 binary vectors nearly
identical to the ones described for the performance model (see above and Marshall,
in press). Each vector represents one arithmetic story problem. The problem is
coded according to the presence or absence of several general characteristics.

The difference between these input vectors and those of the performance
model is the inclusion here of coded information about the form of the question
stated in the problem. In the performance model and in the empirical studies
simulated by it, the items were situational descriptions and contained no questions.
Both the learning model and the hybrid model described below require problems
rather than situations if we are to model the full problem-solving process.

In general, there are two options for item presentation: either the entire set
is presented again and again in some fixed order, making an orderly cycle through
the entire stimulus list and ensuring that each item is presented an equal number of
times; or each presentation is randomly determined at the time of the trial, so that
every item in the set has an equal chance of being selected on every trial. We have
implemented the latter, primarily because we wished to avoid any possible order
effects and also because items were always randomly generated for students in our
empirical learning experiments.
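The two presentation options can be contrasted in a short Python sketch. This is only an illustration, not the report's program: the function names and the fixed seed are assumptions introduced here, and items are stood in for by their index numbers.

```python
import random

def fixed_order_schedule(n_items, n_trials):
    """Cycle through item indices 0..n_items-1 in the same order, repeatedly."""
    return [trial % n_items for trial in range(n_trials)]

def random_schedule(n_items, n_trials, rng):
    """Draw an item index independently and uniformly on every trial."""
    return [rng.randrange(n_items) for _ in range(n_trials)]

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
fixed = fixed_order_schedule(100, 300)   # every item appears exactly 3 times
drawn = random_schedule(100, 300, rng)   # counts per item vary by chance
```

Under the fixed schedule every item is seen equally often by construction; under random selection the equality holds only in expectation, which is the option the text says was implemented.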

Outputs. The model outputs correspond to the identifications of situations
given in the story problems. For each problem presented, the model can make one
of five possible responses, one for each situation.

Input Units. In a single trial the layer of input units is comprised of one
input vector. Each element of the input vector takes a value of 1 (if the
characteristic it represents is present in the selected item) or 0 (if it is absent). The
input vector to the learning model contains the original 25 elements used in the
performance model plus the additional 2 elements to code the question, resulting in
a 27-element vector.

Hidden Units. The middle layer of the model contains hidden units. Each
of these is connected to every input unit in the layer below it and each in turn
contributes to the activation of all output units above it. As mentioned previously,
there are 14 hidden units.

Output Units. Each situation is represented by one output unit. On each
trial, following activation from the hidden unit layer directly below, each output unit
has some level of activation. The one with the highest level is the model response
for that trial.

Bias. Each hidden and output unit has a bias associated with it. The bias is
added to the incident activation upon the units and functions like a threshold for the
unit (cf. Rumelhart). If insufficient activation is received at the unit to overcome the
effect of the bias, then the output of the unit will be insignificant.

Input-to-Hidden Weights. As in most connectionist models, each of the
input units connects to each of the units in the layer immediately above it, i.e., the
hidden unit layer, and each connection has its own unique weight. When an input
vector is presented to the model, activation spreads from the input units to the
hidden units. The amount of activation spread is determined in part by the strength
(i.e., weight) of the connection.

Hidden-to-Output Weights. Each hidden unit is connected to every output
unit and each has a strength or weight. The values of these weights are also
randomly generated for the initial trial, using the same constraints as for the
input-to-hidden weights.

Model Parameters and Initialization Values

The model requires that two parameters be set: the learning rate $\eta$ and the
momentum $\mu$. For most of our tests, we have used $\eta$ = .07 and $\mu$ = .9.

Additionally, the model requires that each unit i have a bias term $\beta_i$ and
that each connection between a pair of units i and j have a starting weight $\omega_{ij}$. The
bias terms and the weights are generated initially from a uniform distribution
ranging from -.005 to +.005.
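The initialization just described can be sketched as follows. The layer sizes (27, 14, 5) and the range of plus or minus .005 come from the text; the function name, the list-of-lists layout, and the seed are assumptions made for this illustration.

```python
import random

N_INPUT, N_HIDDEN, N_OUTPUT = 27, 14, 5   # layer sizes given in the text
INIT_RANGE = 0.005                        # weights and biases start in [-.005, +.005]

def init_layer(n_units, n_below, rng):
    """Return (weights, biases); weights[i][j] links unit i to unit j in the layer below."""
    weights = [[rng.uniform(-INIT_RANGE, INIT_RANGE) for _ in range(n_below)]
               for _ in range(n_units)]
    biases = [rng.uniform(-INIT_RANGE, INIT_RANGE) for _ in range(n_units)]
    return weights, biases

rng = random.Random(1)  # fixed seed only so the sketch is repeatable
w_hidden, b_hidden = init_layer(N_HIDDEN, N_INPUT, rng)    # 14 x 27 weights
w_output, b_output = init_layer(N_OUTPUT, N_HIDDEN, rng)   # 5 x 14 weights
```

Starting every weight close to zero means that all five output units begin with nearly equal activation, so the initial responses carry no built-in preference for any situation.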

Finally, the learning criterion must be set. This requires choosing a
tolerance value that indicates how many errors, if any, will be allowed and
specifying how large the output value must be in order to be considered correct. We
use a 90 percent tolerance standard; that is, the model must correctly identify at least
90 of the 100 test items. To be considered a correct response, the appropriate output
unit must have a value that is at least .25 larger than the next largest output unit.
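A minimal sketch of this two-part criterion is given below. The thresholds (.25 and 90 of 100) are taken from the text; the function names and the example activation values are hypothetical and chosen only to show one passing and one failing case.

```python
MARGIN = 0.25     # required lead over the next largest output unit
CRITERION = 90    # number of the 100 items that must be correct

def is_correct(outputs, target_index, margin=MARGIN):
    """True if the target unit exceeds every other output unit by at least `margin`."""
    others = [v for k, v in enumerate(outputs) if k != target_index]
    return outputs[target_index] - max(others) >= margin

def meets_criterion(n_correct, criterion=CRITERION):
    """True if enough test items were answered correctly."""
    return n_correct >= criterion

# Made-up activation values for the five output units:
clear_win = is_correct([0.81, 0.20, 0.10, 0.30, 0.05], target_index=0)   # lead of .51
narrow_win = is_correct([0.60, 0.45, 0.10, 0.30, 0.05], target_index=0)  # lead of .15
```

Note that the second case fails even though the target unit has the highest activation: being the largest is not enough without the required margin.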

Under the parameter selections and initialization values described here, the
model converges at approximately 7,000 trials.




Figure 1: The Connectionist Network


Technical Details of the Learning Model^


The model consists of the three-layer network, shown in Figure 1, with each
layer comprised of a set of units. Typically, one thinks of the input units as being at
the bottom level of the model and the output units at the top, as in Figure 1. The
hidden units make up the middle layer. Each layer is fully connected to the layers
immediately above and/or below it, as shown in Figure 1.

We define the following elements of the model:

    $\eta$          the learning rate;
    $\mu$           the momentum factor;
    $\alpha_i$      the activation that accumulates in each hidden or
                    output unit i;
    $\lambda_i$     the activation that spreads from unit i to units
                    above it;
    $\omega_{ij}$   the weight associated with the connection
                    between units i and j;
    $\beta_i$       the bias associated with unit i;
    $\tau_i$        the target level of output activation; externally set
                    as 1 if the output unit i represents the correct
                    situation or 0 if it does not;
    $\varepsilon_i$ the error associated with unit i.

The model learns by processing an input vector and forward propagating
activation from the lowest level to the highest, by calculating the error at this
highest level and then backward propagating the error down through all levels, and
by adjusting all weights connecting pairs of activated units accordingly.

^ All programs for the models in this report were written in C++ and run on a PC-80486
workstation.

The activation spreading out from a unit is defined as:

$$\lambda_i = \begin{cases} 1 \text{ or } 0 & \text{if } i \text{ is an input unit} \\[4pt] \dfrac{1}{1 + e^{-(\alpha_i + \beta_i)}} & \text{if } i \text{ is a hidden or output unit} \end{cases}$$

where $\beta_i$ is the bias associated with unit i and $\alpha_i$ is the activation that has
accumulated into the unit from the layer of units below. The accumulated activation
is determined by:

$$\alpha_i = \sum_j \omega_{ij} \lambda_j$$

If i is a hidden unit, j refers to input units, and the summation occurs over all
weights between a hidden unit and the input units at the level below and the $\lambda_j$ of
the input units. If i is an output unit, j refers to hidden units. Each of the units j at
the level below unit i will have an associated $\lambda_j$ which influences unit i. Note that
$\alpha_i$ is defined only for hidden and output units.
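The forward spread of activation can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the report's C++ program: the names `spread` and `forward_layer` are introduced here, the logistic exponent is read as the accumulated activation plus the bias (as the bias description suggests), and the small weight matrix is made up.

```python
import math

def spread(alpha, beta):
    """lambda_i for a hidden or output unit: logistic of accumulated activation plus bias."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta)))

def forward_layer(weights, biases, lam_below):
    """Compute lambda for every unit in a layer from the lambdas of the layer below."""
    lam = []
    for w_row, beta in zip(weights, biases):
        # alpha_i = sum over j of w_ij * lambda_j
        alpha = sum(w * l for w, l in zip(w_row, lam_below))
        lam.append(spread(alpha, beta))
    return lam

# Toy layer: two units above three input units with values 1, 0, 1.
hidden = forward_layer([[0.5, -0.5, 0.0], [0.0, 0.0, 0.0]], [0.0, 0.0], [1, 0, 1])
```

A unit that receives no net activation and has zero bias spreads exactly .5, the midpoint of the logistic function; this is why near-zero starting weights leave all units close to .5 on the first trials.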

The spread of activation begins with the input units. Those with values of 1
activate their associated hidden units which in turn pass some of the activation to
the output units (by means of their $\lambda_i$'s). When the forward propagation of
activation is completed, for every output unit i the difference between $\tau_i$ and $\lambda_i$ is
used to compute the error $\varepsilon_i$:

$$\varepsilon_i = (\tau_i - \lambda_i)\,\frac{\partial \lambda_i}{\partial \alpha_i}$$

where the derivative can be expressed as

$$\frac{\partial \lambda_i}{\partial \alpha_i} = \lambda_i (1 - \lambda_i)$$

thus yielding a final form for the unit's error of

$$\varepsilon_i = (\tau_i - \lambda_i)\,\lambda_i (1 - \lambda_i)$$

with all terms as defined above. The error signal is then passed to the hidden units
by:

$$\varepsilon_i = \sum_j \varepsilon_j\,\omega_{ji}\,\lambda_i (1 - \lambda_i)$$

for each hidden unit i. The summation occurs over all output units j. Input units do
not accumulate error.
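The two error computations can be sketched as below. The function names and the layout `w_out[j][i]` (the weight from hidden unit i up to output unit j) are assumptions made for this illustration; the numbers in the usage lines are chosen so the arithmetic can be checked by hand.

```python
def output_errors(targets, lam_out):
    """Error of each output unit: (target - lambda) * lambda * (1 - lambda)."""
    return [(t - l) * l * (1.0 - l) for t, l in zip(targets, lam_out)]

def hidden_errors(eps_out, w_out, lam_hidden):
    """Error of each hidden unit i: weighted sum of output errors times lambda_i * (1 - lambda_i).

    w_out[j][i] is the weight from hidden unit i up to output unit j.
    """
    errors = []
    for i, l in enumerate(lam_hidden):
        back = sum(eps_out[j] * w_out[j][i] for j in range(len(eps_out)))
        errors.append(back * l * (1.0 - l))
    return errors

# Two output units at activation .5; the first is the target.
eps_out = output_errors([1, 0], [0.5, 0.5])                       # [0.125, -0.125]
eps_hid = hidden_errors(eps_out, [[1.0, 0.0], [0.0, 2.0]], [0.5, 0.5])
```

The sign of each output error shows the direction of correction: positive for the target unit, whose activation should rise, and negative for a non-target unit, whose activation should fall.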


After the error signal has propagated backwards through the network, the
weights are adjusted by:

$$\omega_{ij}(t) = \omega_{ij}(t-1) + \Delta\omega_{ij}(t)$$

where t indicates the current trial and (t - 1) is the previous trial. The amount of
change is determined by:

$$\Delta\omega_{ij}(t) = \eta\,\varepsilon_i\,\lambda_j + \mu\,(\Delta\omega_{ij}(t-1))$$

Note that $\omega_{ij}$ emanating from input units with initial values of 0 will receive no
adjustment. A similar adjustment is made for the bias terms, with

$$\beta_i(t) = \beta_i(t-1) + \Delta\beta_i(t)$$

and the amount of change is computed by

$$\Delta\beta_i(t) = \eta\,\varepsilon_i\,\lambda_i$$

After the weights and biases have been adjusted, the activation and error
terms are reset to zero for the next trial. The only carryover from trial to trial is
contained in the weights, the biases, and their delta values ($\Delta\omega_{ij}$ and $\Delta\beta_i$).
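The weight adjustment with momentum can be sketched as follows. The values .07 and .9 are the ones reported in the text; the function name, the in-place list layout, and the toy numbers are assumptions of this illustration, and the bias update is left out.

```python
ETA, MU = 0.07, 0.9   # learning rate and momentum reported in the text

def update_weights(weights, deltas, eps, lam_below, eta=ETA, mu=MU):
    """Apply delta_w_ij(t) = eta * eps_i * lambda_j + mu * delta_w_ij(t-1), in place."""
    for i, eps_i in enumerate(eps):
        for j, lam_j in enumerate(lam_below):
            deltas[i][j] = eta * eps_i * lam_j + mu * deltas[i][j]
            weights[i][j] += deltas[i][j]

# One unit above two input units; only the first input is active.
weights = [[0.0, 0.0]]
deltas = [[0.0, 0.0]]            # previous changes, carried from trial to trial
update_weights(weights, deltas, eps=[0.5], lam_below=[1, 0])   # change of .035
update_weights(weights, deltas, eps=[0.5], lam_below=[1, 0])   # change of .035 + .9 * .035
```

The second call moves the weight further than the first even though the error is identical, because nine tenths of the previous change is carried forward. The weight from the inactive input stays at zero throughout.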

The model runs with alternating learning and testing phases. The learning
phase runs in blocks of 100 trials. At the conclusion of every block of trials, the
model suspends the backwards propagation of error and runs a performance test
over all input vectors to determine whether it has yet reached a specified criterion
for successful learning. During a testing phase, the model maintains an unchanging
set of weights, which is the set reached on the last trial of the previous learning
phase.
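The alternation of learning blocks and testing phases amounts to a simple control loop, sketched below. The two callables and the `max_blocks` safety cap are hypothetical stand-ins introduced here; the stub learner at the end exists only to exercise the loop and does not model the network.

```python
def run_until_criterion(train_trial, count_correct, block_size=100,
                        criterion=90, max_blocks=1000):
    """Alternate learning blocks and tests; return the number of trials used, or None."""
    for block in range(1, max_blocks + 1):
        for _ in range(block_size):
            train_trial()                    # one learning trial (weights may change)
        if count_correct() >= criterion:     # testing phase (weights held fixed)
            return block * block_size
    return None

# Stub learner: gets one more test item right for every 10 training trials.
state = {"trials": 0}

def train_trial():
    state["trials"] += 1

def count_correct():
    return min(100, state["trials"] // 10)

trials_used = run_until_criterion(train_trial, count_correct)
```

Because testing happens only at block boundaries, the reported number of trials to criterion is always a multiple of the block size.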

The criterion for learning is the correct response to at least 90 of the 100
input items, with "correctness" established as the activation of the appropriate
output unit i and with $\lambda_i$ at least .25 larger than the next largest activation value for
any output unit. In practice, the system typically converges with 94-97 items correct
in the testing phase. Given the fact that $\lambda_i$ ranges only from 0 to 1, a difference
between two values of .25 is highly significant.

During the testing phase, each of the 100 input vectors is presented in a
fixed order to the model and a response is generated following the spread of
activation as before. The response is scored as correct or incorrect, and the next
vector is presented. If the model fails to reach the defined criterion (i.e., errs on
more than 10 of the items), the learning phase resumes with another block of trials.
When the model reaches the defined criterion, the weights are stored for later use, to
be described below.


The  Hybrid  Model 

The hybrid model developed to solve story problems has the form shown in
Figure 2.^ It has three main components: two production systems, represented in
the figure by the decision boxes and arrows, and a connectionist network,
represented in the figure as a set of nodes and links. All of these interact with each
other, indicated in the figure by the arrows leading into and out of the rectangle in
the middle of the figure. The connectionist model of Figure 2 is the identical model
described under the learning model. It performs here in its testing mode; at this
point the model is presumed to have learned the classifications, and no additional
changes in weights occur. The weights used to compute a response for the hybrid
model are those that were saved when the learning model reached its learning
criterion.

The problem input to the full model now includes more than the vector of
characteristics used in the performance and learning models. In addition to this
vector, which remains the input to the connectionist part of the hybrid model,
problem input consists of specific detail about the quantities found in the problem.
This information is encoded by dividing the problem into several clauses. Each
clause contains three types of information: owner, object, and time.

An example of clause coding for a specific problem is given in Table 1.
Owner contains two fields: name and type. Object contains four fields: name,
type, value, and action. The action will contain such information as is necessary to
determine which arithmetic operation to use. For example, an action might be
increase, decrease, more, or less. The final type of clause information is Time,
which contains just one field that indicates a relative time of occurrence within the
problem. A clause may contain multiple owners and multiple objects, and it may
omit time. The clause information is provided as input to the smaller production
system (indicated by the arrow in Figure 2).
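One way to picture the clause coding is as a small record structure. The sketch below uses Python dataclasses; the class names, the use of `None` for an unknown value, and the field layout are assumptions made for illustration and are not the report's own data format. The example clause mirrors the "$25 doll" clause of the problem in Table 1.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Owner:
    name: str
    type: str

@dataclass
class Obj:
    name: str
    type: str
    value: Optional[float]   # None stands for an UNKNOWN value
    action: str              # e.g. "increase", "decrease", "more", "less", "none"

@dataclass
class Clause:
    owners: List[Owner]          # a clause may have several owners
    objects: List[Obj]           # and several objects
    time: Optional[int] = None   # relative time of occurrence; may be omitted

# "He bought a doll for Sue that cost $25"
doll_clause = Clause(
    owners=[Owner("Joe", "person")],
    objects=[Obj("amount", "dollars", 25.0, "decrease"),
             Obj("doll", "toys", 1.0, "increase")],
    time=1,
)
```

A single purchase yields two objects in one clause: the money decreases while the toys increase, and both share the same relative time.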




Figure 2: The Hybrid Model


When a problem is presented to the model, the connectionist network makes
the appropriate recognition of the situation using the input vector as before, and it
passes that information into a common area accessed by all parts of the model
(represented by the rectangle). From here the information is available to the smaller
production system (on the left side of the figure). This production system has as its
goal the recognition of relevant elements of the problem once the situation is
known. It passes its results back into the common area to be used by the
connectionist network again if necessary or by the larger production system, which
will produce a numerical solution.

The information derived from the clauses is used by the
production system to determine which values of the problem are known, which are
unknown, and their relationship to each other. Just as the recognition of the
situation uses constraint knowledge about story problems, the selection of relevant
pieces of the problem uses feature knowledge. To illustrate what we mean by
feature knowledge, we describe briefly the change situation. A change situation is
characterized by a permanent alteration over time in a measurable quantity of a
single, specified thing. Thus, the model must confirm that only one thing is
represented in the clauses. There are three aspects to a change situation: a starting
amount, an amount by which it is to be changed, and an ending amount. The model
must check that there are three available amounts, even if one of them is unknown.
A change takes place over time, so the model looks for three distinct times to be
represented in the change situation. The production system works through the
clauses, confirming that similar elements are involved and placing the values from
the problem on a list that can be used by the larger production system in the model.
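The checks described for a change situation can be written as a single predicate. This is a hedged sketch: the function name and the triple format are hypothetical, and the limit of at most one unknown amount is an assumption drawn from the phrase "even if one of them is unknown".

```python
def fits_change_situation(entries):
    """entries: list of (thing_type, value_or_None, time) triples taken from the clauses."""
    things = {thing for thing, _, _ in entries}
    times = {time for _, _, time in entries}
    unknowns = sum(1 for _, value, _ in entries if value is None)
    return (len(things) == 1        # a single, specified thing
            and len(entries) == 3   # starting amount, amount of change, ending amount
            and len(times) == 3     # three distinct times
            and unknowns <= 1)      # assumption: at most one amount is unknown

# Dollars at three times, with the ending amount unknown (None).
change_ok = fits_change_situation(
    [("dollars", 100.0, 0), ("dollars", 53.0, 1), ("dollars", None, 2)])
# Mixing dollars and toys violates the "single thing" requirement.
change_mixed = fits_change_situation(
    [("dollars", 100.0, 0), ("toys", 2.0, 1), ("dollars", None, 2)])
```

When the predicate holds, the values in time order (here 100, 53, and the unknown) form the list that would be handed on for planning.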

Thus, for the hybrid model of Figure 2, the constraint knowledge is
modeled by the connectionist network, and the feature knowledge is modeled by a
production system. In the full hybrid model of schema knowledge, relevant feature
and constraint knowledge are used to plan a solution. Thus, the input to the
planning component of the hybrid model is the output from the feature production
system coupled with the output from the connectionist model of constraint
knowledge. Together, they provide sufficient information for the planning
production system to set a series of goals and to call on the appropriate execution
knowledge for achieving them. Table 1 illustrates some of the production rules.

Of the four types of knowledge that comprise a schema, we consider
execution knowledge to be the least interesting, and we have made little attempt to
model how individuals learn the basic arithmetic operations. We take as given that
these are in place. Our argument here is that there already exist production systems
designed to model the acquisition and use of the algorithms of addition, subtraction,
multiplication and division. The model here focuses instead on the selection of the
appropriate values from the problem to use in carrying out necessary computations.
Thus, the errors that can be modeled are those reflecting mistakes in selecting pieces
of the problem or in selecting an operation to be carried out. Errors of computation
are not possible (e.g., 3 x 4 = 7). The consequence of this assumption about
algorithms is that we have not constructed a separate production system to make the
computations, although it would be easy to do so.^

The model is instantiated with input to the connectionist model in the lower
portion of Figure 2. The input consists of a single binary vector representing all
information in the multi-step problem. Thus, pointers to more than one situation
typically occur. The connectionist network identifies the most salient situation, and
passes that information to the feature identifier (represented in Figure 2 as the
smaller of the two production systems). Using the output from the connectionist
network together with the clause information, this part of the model determines the
best configuration of data to represent the selected situation. Several configurations
may be possible. The production system selects a subset of clauses to represent each
one. If there are multiple configurations, each one is then evaluated using the
original connectionist model. For each configuration the production system creates
a new input vector that contains only the information of the selected clauses. The
output values associated with each input vector are compared, and the input vector
leading to the highest value is selected as the immediate problem to be solved. The
identified situation and its subset of clauses are then passed to the planning
component of the schema. Thus, feature knowledge (i.e., the production system) and
constraint knowledge (i.e., the connectionist network) interact to provide the
necessary information that will be used to plan the solution.
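The comparison of candidate configurations can be sketched as a search for the highest output value. Everything named here is hypothetical: `evaluate` stands in for the trained network, the three-element vectors and activation values are made up, and comparing the output unit of the already identified situation (rather than the largest output overall) is an assumption of this sketch.

```python
def select_configuration(configurations, evaluate, situation_index):
    """Return (name, value) of the configuration with the highest output for the situation."""
    best_name, best_value = None, float("-inf")
    for name, vector in configurations.items():
        value = evaluate(vector)[situation_index]   # output unit of the identified situation
        if value > best_value:
            best_name, best_value = name, value
    return best_name, best_value

# Stand-in for the trained network: a lookup table of made-up output values.
fake_outputs = {(1, 1, 0): [0.2, 0.7], (1, 0, 1): [0.3, 0.5]}

def evaluate(vector):
    return fake_outputs[tuple(vector)]

configs = {"Combo 1": [1, 1, 0], "Combo 2": [1, 0, 1]}
best = select_configuration(configs, evaluate, situation_index=1)
```

Each candidate is scored by the same fixed network, so the choice among configurations needs no additional learning.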

A plan begins with the creation of a goal stack in which the top level goal
is to produce a numerical solution. Additional goals are added to the stack and
removed as they are achieved. A number of different goals are addressed by the
production system. Some have to do with locating the unknown in the problem.
Others center on carrying out the appropriate computations. Like feature knowledge
and constraint knowledge, planning knowledge is schema-specific. The model uses
its knowledge about the current schema to develop plans for solving the problem.
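The goal-stack mechanism can be illustrated with a much reduced rule set for a group situation whose supergroup is unknown. This is a sketch under stated assumptions, not the model's production system: the goal names echo those in the Table 1 trace, but the rules are simplified (the subgroup-counting goals are omitted) and the function handles only an unknown supergroup.

```python
def solve_group(values):
    """values: [supergroup, subgroup, subgroup, ...]; None marks the unknown supergroup."""
    values = list(values)
    if None in values[1:]:
        raise ValueError("this sketch handles only an unknown supergroup")
    goals = ["SOLVE"]          # goal stack; the last element is the top goal
    answer = None
    while goals:
        top = goals[-1]
        if top == "SOLVE" and None in values:
            goals.append("ID_PART_GROUP")       # find which part is unknown
        elif top == "ID_PART_GROUP" and values[0] is None:
            goals.append("SUPERGROUP")          # the unknown is the supergroup
        elif top == "SUPERGROUP" and answer is None:
            answer = sum(values[1:])            # add all subgroup values
        elif top == "SUPERGROUP" and values[0] is None:
            values[0] = answer                  # store the answer in the problem values
        else:
            goals.pop()                         # goal achieved: remove it from the stack
    return answer, values

answer, final_values = solve_group([None, 25.0, 28.0])
```

With the two toy prices as subgroups, the stack grows to three goals, the sum 53 is stored as the supergroup, and the goals are then removed one by one until the stack is empty.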

Table 1 illustrates a number of different goals and the steps the model takes
to achieve them. The model attempts to solve the first subproblem it recognizes. If
it is successful at this point, the solution is passed back to the planning component
which then must determine whether the entire problem has been solved or only a
sub-problem. A number of things feed into this determination. First, the
connectionist network is called upon to find any other plausible situations after the
first one has been removed from the problem. A check is carried out to see if there
are additional unknowns anywhere in the known problem structure. If potential
subproblems are discovered, their clauses and relevant input information are fed back
into the model, and the entire cycle begins again. If no additional sub-problems are
recognized, the model produces as its answer the computed value for the last
sub-problem it solved. Table 1 contains a complete trace of the model's activity for a
multi-step problem.

^ An additional reason to omit the modeling of computational errors is that the subjects
whose performance we have studied rarely make these errors. All of our subjects have
been college students with poor problem-solving skills. They are proficient in
computation but not in problem solving.

The hybrid model is able to solve problems having more than one unknown.
Such problems are common in arithmetic and algebra, and they are frequently
studied because students do not routinely solve them easily. The different
components of the model pass information back and forth as necessary. For some
problems, a re-cycling through the connectionist network will be unnecessary
because only one configuration will be possible. For other problems, the model
may move back and forth between the connectionist network and the production
systems until it develops enough information to create a workable plan.

Thus far, the hybrid model successfully solves problems of the type
illustrated in Table 1. Extensions of the model to deal with more complex problems
are ongoing, as are comparisons of human and model solutions. The initial findings
are encouraging. The hybrid model presented here can solve single or multiple-step
problems, and it produces solutions that appear similar to human subjects' solutions.




Table 1: Hybrid Model Output for a Multi-Step Problem

Model Output [Annotated Description of Output]

[Problem Text]
    Joe won $100 in the state lottery. He spent some of it on toys
    for his two children. He bought a doll for Sue that cost $25 and
    he bought a stuffed bear for Ellen that cost $28. How much of
    his lottery winnings did he have after he bought the toys?

[Input Vector for Connectionist Model]
    111010000010100001001000100

[First Clause]
    owner   Joe person
    object  dollars dollars 100.000000 none
    time    0

[Second Clause]
    owner   Joe person
    object  dollars dollars UNKNOWN decrease;
            toys toys UNKNOWN none
    time    1

[Third Clause]
    owner   Joe person
    object  amount dollars 25.000000 decrease;
            doll toys 1.000000 increase
    time    1

[Fourth Clause]
    owner   Joe person
    object  amount dollars 28.000000 decrease;
            stuffed_bear toys 1.000000 increase
    time    1

[Fifth Clause]
    owner   Joe person
    object  dollars dollars UNKNOWN none
    time    2

[First sub-problem identification by connectionist model.]
    0Jfi9 0J88 OJOl 0J80 0J21 -> GR

[The possible configurations. * indicates the one that yields the highest
activation value (found via small production system and evaluated with
connectionist model).]
    *Combo: 1
     Combo: 2
     Combo: 3

[The clauses that contribute to the configuration selected as best. The
identification of the GROUP situation and the clause information is passed to
the next component of the model, which sets the initial goal and determines
which values will be used in solving the problem.]
    owner   Joe person
    object  dollars dollars UNKNOWN decrease;
            toys toys UNKNOWN none
    time    1

    owner   Joe person
    object  amount dollars 25.000000 decrease;
            doll toys 1.000000 increase
    time    1

    owner   Joe person
    object  amount dollars 28.000000 decrease;
            stuffed_bear toys 1.000000 increase
    time    1
uaed  in  aohnng  the  problem. 




Table 1 continued:

Model Output [Annotated Description of Output]

    Entering ExecuteRules

[IF {the top goal is SOLVE, the situation is GROUP, and the number of
subgroups is not known}
THEN {add a new goal of identifying the number of subgroups}]
    Production Rule: 25
    Goal_Stack: ID_NUMBER_SUBGROUPS SOLVE
    ProblemValues: UNKNOWN 25.00 28.00

[IF {the number of subgroups is unknown and the goal is to find the number
of subgroups}
THEN {count the number of subgroups and store the value}]
    Production Rule: 26
    Goal_Stack: ID_NUMBER_SUBGROUPS SOLVE
    ProblemValues: UNKNOWN 25.00 28.00

[IF {the goal is to find the number of subgroups and that number is now
known}
THEN {delete the goal from the goal stack}]
    Production Rule: 27
    Goal_Stack: SOLVE
    ProblemValues: UNKNOWN 25.00 28.00

[IF {the goal is SOLVE, the situation is GROUP, the number of subgroups is
known, and there is an unknown in the problem}
THEN {set a new goal to find out which part of the problem is unknown}]
    Production Rule: 28
    Goal_Stack: ID_PART_GROUP SOLVE
    ProblemValues: UNKNOWN 25.00 28.00

[IF {the answer is unknown, the goal is to identify where the unknown is
located, and if it is in the supergroup location}
THEN {add the goal of computing the supergroup to the goal stack}]
    Production Rule: 30
    Goal_Stack: SUPERGROUP ID_PART_GROUP SOLVE
    ProblemValues: UNKNOWN 25.00 28.00

[IF {the answer is unknown, and the goal is to find the supergroup}
THEN {add all subgroup values and store the result as the answer}]
    Production Rule: 32
    Goal_Stack: SUPERGROUP ID_PART_GROUP SOLVE
    ProblemValues: UNKNOWN 25.00 28.00

[IF {the goal is to find the supergroup, and the answer is known}
THEN {store this information in the problem values}]
    Production Rule: 33
    Goal_Stack: SUPERGROUP ID_PART_GROUP SOLVE
    ProblemValues: 53.00 25.00 28.00

[IF {the goal is to find the supergroup and it is known}
THEN {delete the goal from the goal stack}]
    Production Rule: 34
    Goal_Stack: ID_PART_GROUP SOLVE
    ProblemValues: 53.00 25.00 28.00

[IF {the goal is to identify the missing part of a group problem but there
are no missing parts}
THEN {delete the goal from the goal stack}]
    Production Rule: 29
    Goal_Stack: SOLVE
    ProblemValues: 53.00 25.00 28.00

[IF {the goal is to solve the problem but there are no unknowns}
THEN {remove the goal from the goal stack}]
    Production Rule: 24
    ProblemValues: 53.00 25.00 28.00

[IF {the goal stack is empty}
THEN {return the answer}]
    Production Rule: 0
    Partial Answer = 53.000000

[The first sub-problem has been solved.]




Table 1 continued:

Model Output [Annotated Description of Model Output]

[At this point the system re-examines the original input to determine if there
are other situations containing other problems to be solved. It finds a CHANGE
and there is only one possible configuration.]
    0.481 0.357 0.381 0.370 0.368 -> CH

    *Combo: 1

[The necessary clauses are identified for the planning and execution
components.]
    owner   Joe person
    object  dollars dollars 100.000000 none
    time    0

[The system recognizes that there is an unknown value for the object toy but
disregards it in favor of the selected change configuration.]
    owner   Joe person
    object  dollars dollars 53.000000 decrease;
            toys toys UNKNOWN none
    time    1

    owner   Joe person
    object  dollars dollars UNKNOWN none
    time    2

[The production system begins a new cycle:]
    Entering ExecuteRules

[IF {the goal is to solve the problem; the situation is CHANGE; and there is
an unknown value on the value list}
THEN {add a new goal of identifying which part of the Change situation is
unknown}]
    Production Rule: 1
    Goal_Stack: ID_PART_CHANGE SOLVE
    ProblemValues: 100.00 53.00 UNKNOWN

[IF {the goal is to identify which part of the problem has an unknown and if
the last element of the value_list is unknown}
THEN {add a new goal of finding the end_result}]
    Production Rule: 16
    Goal_Stack: END ID_PART_CHANGE SOLVE
    ProblemValues: 100.00 53.00 UNKNOWN

[IF {the goal is to find the end_result and the direction of change is
negative}
THEN {set ANSWER to the difference between the start_amount and the amount
of change}]
    Production Rule: U
    Goal_Stack: END ID_PART_CHANGE SOLVE
    ProblemValues: 100.00 53.00 UNKNOWN

[IF {the goal is to find the end_result and there is only one unknown value in
the value_list and if a value is known for ANSWER}
THEN {replace the unknown in the value_list with the value of ANSWER}]
    Production Rule: 19
    Goal_Stack: END ID_PART_CHANGE SOLVE
    ProblemValues: 100.00 53.00 47.00

[IF {the goal is to find the end_result and there are no unknowns in the
value_list}
THEN {delete the goal from the goal stack}]
    Production Rule: 20
    Goal_Stack: ID_PART_CHANGE SOLVE
    ProblemValues: 100.00 53.00 47.00

[IF {the goal is to identify a missing part but all parts are known}
THEN {delete the goal from the goal stack}]
    Production Rule: 3
    Goal_Stack: SOLVE
    ProblemValues: 100.00 53.00 47.00

[IF {the goal is to solve the problem but there are no unknowns}
THEN {pop the goal from the goal stack}]
    Production Rule: 4
    ProblemValues: 100.00 53.00 47.00

[IF {the goal stack is empty}
THEN {return the answer}]
    Partial Answer = 47.000000

    Final Answer = 47.000000




References 


Anderson,  J.  A.  (1990).  Hyixid  conqwtation  in  cognitive  science:  Neural  networks 
and  symbols.  Applied  Cognitive  Psychology,  4,  337-347. 

Berdter,  C.  (1991).  Implications  of  connectionism  f(»'  thinking  about  rules. 
Educational  Researcher,  20.  10-16. 

Johnson-Laird,  P.  (1988).  The  computer  and  the  mind.  Camixidge,  MA:  Harvard 
University  Press. 

Marshall, Sandra P. (1990). The assessment of schema knowledge for arithmetic
story problems: A cognitive science perspective. In G. Kulm (Ed.), Assessing
higher order thinking in mathematics. Washington, D.C.: AAAS.

Marshall, Sandra P. (in press). Assessing schema knowledge. In N. Frederiksen, R.
Mislevy, & I. Bejar (Eds.), Test Theory for a New Generation of Tests.
Hillsdale, NJ: Lawrence Erlbaum Associates. [a]

Marshall, Sandra P. (in press). Assessment of rational number understanding: A
schema-based approach. In T. Carpenter, E. Fennema, & T. Romberg (Eds.), Rational
Numbers: An Integration of Research. Hillsdale, NJ: Lawrence Erlbaum
Associates. [b]

Marshall, Sandra P. (in press). Statistical and cognitive models of learning through
instruction. To appear in Meyrowitz, A. L. & Chipman, S. (Eds.), Cognitive
Models of Complex Learning. Norwell, MA: Kluwer Academic Publishers. [c]

Marshall, Sandra P., Barthuli, Kathryn E., Brewer, Margaret A., & Rose, Frederic
E. (1989). STORY PROBLEM SOLVER: A schema-based system of
instruction. Technical Rep. CRMSE 89-01 (ONR Contract N00014-85-K-
0661), San Diego: San Diego State University.

Marshall, Sandra P. (1991). Computer-Based Assessment of Schema Knowledge in
a Flexible Problem-Solving Environment. Technical Rep. CRMSE 91-01 (ONR
Grant N00014-89-J-1143), San Diego: San Diego State University.


McClelland, J. L., & Rumelhart, D. E. (1989). Explorations in parallel distributed
processing: A handbook of models, programs, and exercises. Cambridge, MA:
The MIT Press.

Minsky, M. (1991). Logical versus analogical or symbolic versus connectionist or
neat versus scruffy. AI Magazine, Summer 1991, 35-51.

Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal
representations by error propagation. In Rumelhart, D. E. & McClelland, J. L.
(Eds.), Parallel distributed processing. Volume 1: Foundations (pp. 318-362).
Cambridge, MA: The MIT Press.

Schneider, W. & Oliver, W. L. (in press). An instructable connectionist/control
architecture: Using rule-based instructions to accomplish connectionist learning
in a human time scale. To appear in K. VanLehn (Ed.), Architectures for
intelligence. Hillsdale, NJ: Erlbaum Associates.

Wenger, E. (1987). Artificial intelligence and tutoring systems. Los Altos, CA:
Morgan Kaufmann Publishers, Inc.




